Middle chain-specific thioesterase genes from Cuphea lanceolata

ABSTRACT

The present invention is directed to DNA sequences from Cuphea lanceolata that code for a middle chain-specific acyl-[ACP]-thioesterase, and to alleles and derivatives of these DNA sequences. The present invention is also directed to a process for producing plants, parts of plants or plant products that contain these DNA sequences, or alleles or derivatives of these DNA sequences, where the plants, parts of plants or plant products produce fatty acids of middle chain length.

FIELD OF THE INVENTION

This invention concerns DNA sequences which code for a middle chain-specific acyl-[ACP]-thioesterase, as well as alleles and derivatives of these DNA sequences.

BACKGROUND OF THE INVENTION

The thioesterases play a substantial role in the production of fatty acids in plant organisms. With respect to cellular compartments, fatty acid biosynthesis and triacylglyceride biosynthesis can be viewed as separate pathways; with respect to the end product, they can be viewed as a single biosynthetic pathway. De novo biosynthesis of fatty acids takes place in the plastids and is catalyzed by three enzymes or enzyme systems: acetyl-CoA carboxylase (ACCase), fatty acid synthase (FAS), and acyl-[ACP]-thioesterase (TE).

In most organisms the end products of this reaction pathway are palmitic acid (C_(16:0)), stearic acid (C_(18:0)) and, after desaturation, oleic acid (Δ9C_(18:1)). The acyl-[ACP]-thioesterase (TE) functions in determining the chain length.

In contrast, triacylglyceride biosynthesis takes place at the endoplasmic reticulum in the cytoplasm via the so-called "Kennedy pathway" from glycerol-3-phosphate, which is probably provided by the activity of glycerol-3-phosphate dehydrogenase (G3P-DH), and fatty acids, which are present as acyl-CoA substrates.

In animal systems (e.g. the rat), the acyl-[ACP]-thioesterase is an integral part of FAS I and is responsible for terminating fatty acid biosynthesis there. A second acyl-[ACP]-thioesterase (TEII), which is expressed in specific tissues, is responsible for the early termination of chain elongation in the mammary glands of the rat, and causes the release of C_(10:0) and C_(12:0) fatty acids. Expression of this TEII in mouse fibroblasts resulted in the formation of these middle chain fatty acids in these cells. It is therefore concluded that this enzyme is significantly involved in determining chain length (S. A. Bayley et al., Bio/Technology 6, p. 1219-1221 (1988)).

Acyl-[ACP]-thioesterases have also been purified from plants and analyzed for their activity. Acyl-[ACP]-thioesterases with a preference for the hydrolysis of long chain acyl-[ACP] compounds were isolated from Carthamus tinctorius (T. A. McKeon et al., J. Biol. Chem. 257, p. 12141-12147 (1982)), Cucurbita moschata (H. Imai et al., Plant Mol. Biol. 20, p. 199-206 (1992)), and Brassica napus (A. Hellyer et al., Plant Mol. Biol. 20, p. 763-780 (1992)). Corresponding cDNAs have already been isolated from Carthamus tinctorius (D. S. Knutzon et al., Plant Physiol. 100, p. 1751-1758 (1992)) and Brassica napus (E. S. Loader et al., Plant Mol. Biol. 23, p. 769-778 (1993)). Another TE, with specificity for the hydrolysis of C_(12:0)-[ACP], has been isolated from Umbellularia californica (California laurel) and was separated from the activity of a C_(18:0)-[ACP]-specific TE (M. R. Pollard et al., Arch. Biochem. Biophys. 284, p. 306-312 (1991)). In Cuphea lanceolata, the activities of a middle chain-specific and a long chain-specific TE were detected as well (P. Dormann et al., Planta 189, p. 425-432 (1993)).

An only partially purified enzyme preparation of a C_(10:0)-specific acyl-[ACP]-thioesterase from Cuphea hookeriana is described in WO 91/16421. Measurements of the hydrolysis activity of this preparation against various substrates show that it contains significant amounts of activity that are not C_(10:0) specific.

For the TE from Umbellularia californica, a cDNA was isolated which codes for a middle chain-specific acyl-[ACP]-thioesterase. This TE caused the formation of middle chain fatty acids in seeds of transgenic Arabidopsis thaliana and B. napus plants, in particular lauric acid (C_(12:0)) and, in small amounts, myristic acid (C_(14:0)) (T. A. Voelker et al., Science 257, p. 72-74 (1992) and H. M. Davies and T. A. Voelker in Murata, N. and C. Somerville (editors): Current Topics in Plant Physiology: Biochemistry and Molecular Biology of Membrane and Storage Lipids of Plants, Vol 9, p. 133-137; American Society of Plant Physiologists, Rockville (1993)).

There is an increasing demand for middle chain fatty acids, e.g. capric acid (C_(10:0)), which can be used in industry as plasticizers, lubricants, pesticides, surfactants, cosmetics, etc. One possibility for making these fatty acids available is their isolation (extraction) from plants with especially high contents of these fatty acids. Classical breeding of plants that produce elevated levels of these fatty acids has increased the content of middle chain fatty acids only to a limited extent.

Therefore it is the goal of this invention to provide genes or DNA sequences, which can be used to improve the yield of oils and the production of middle chain fatty acids in plants, which are not capable of producing these fatty acids themselves or only in small amounts.

SUMMARY OF THE INVENTION

This goal is achieved with the DNA sequences according to patent claim 1 or the genes from the genomic clones according to patent claim 6.

This invention concerns DNA sequences which code for a middle chain-specific acyl-[ACP]-thioesterase, and the alleles and derivatives of these DNA sequences.

Furthermore, this invention concerns genomic clones which contain DNA sequences that code for a middle chain-specific acyl-[ACP]-thioesterase together with promoters and regulatory sequences, and the alleles as well as the derivatives of these DNA sequences.

Furthermore, this invention concerns a process for the production of plants, parts of plants, and plant products, in which a DNA sequence which codes for a middle chain-specific acyl-[ACP]-thioesterase is transferred by means of gene technology.

Finally, this invention concerns the use of this DNA sequence for the transfer of genes for middle chain-specific acyl-[ACP]-thioesterases into plants.

SUMMARY OF THE FIGURES

The figures serve to explain the invention presented here. They show:

FIG. 1 the presentation of the DNA and amino acid sequence of the degenerate oligonucleotides 3532 and 2740;

FIG. 2 the restriction maps of the genomic clones for the acyl-[ACP]-thioesterase ClTEg1, ClTEg16, ClTEg4 and ClTEg7 from Cuphea lanceolata;

FIGS. 3A-3I a comparison of the amino acid sequences of thioesterases from various plants;

FIG. 4 functional parts of binary vectors for the expression of the claimed DNA sequences and genes from the genomic clones in transgenic plants;

FIG. 5 the gas chromatogram of the contents of fatty acids in unripe rape seeds (pNBM99-2TE);

FIG. 6 the gas chromatogram of the contents of fatty acids in unripe tobacco seeds (pNBM99-2TE);

FIG. 7 the gas chromatogram of the contents of fatty acids in ripe rape seeds (pNBM99-TEg1); and

FIG. 8 the gas chromatogram of the contents of fatty acids in ripe rape seeds (pNBM99-TEg16).

DETAILED DESCRIPTION OF THE INVENTION

It is self-evident that allelic variants and derivatives of the DNA sequences according to the invention are included in the scope of the invention, provided that these modified DNA sequences and genes code for middle chain-specific acyl-[ACP]-thioesterases. For instance, deletions, substitutions, insertions, inversions or additions in the DNA sequences according to this invention are considered to be allelic variants and derivatives.

The DNA sequences according to this invention, as well as the genes from the genomic clones, code for middle chain-specific acyl-[ACP]-thioesterases, which catalyze the formation of C_(8:0) to C_(14:0) fatty acids. In particular, the invention applies to acyl-[ACP]-thioesterases that are specific, or for the most part specific, for C_(10:0), and to acyl-[ACP]-thioesterases that are specific, or for the most part specific, for C_(14:0), which are responsible for the formation of capric acid and myristic acid, respectively, in fatty acid synthesis.

Any plant material which produces these thioesterases in sufficient amounts is suitable as the starting material for the isolation of genes which code for middle chain-specific acyl-[ACP]-thioesterases. The plant Cuphea lanceolata, which originates in Central America, has proven to be an especially suitable starting material in the invention disclosed here. The seeds of this plant contain 83% capric acid.

In order to isolate DNA sequences according to the invention, a cDNA library from Cuphea lanceolata (wild type) was screened for genes for middle chain-specific acyl-[ACP]-thioesterases with a hybridization probe, PCR42, which was obtained by PCR (polymerase chain reaction). In this way, the cDNA clones ClTE13, ClTE5 and ClTE12 were isolated.

The cDNAs obtained in this way were fully sequenced in both directions in the usual way. The ClTE13-cDNA comprises 1494 bp as an ApaI-EcoRI fragment and contains the complete structural gene for a middle chain-specific acyl-[ACP]-thioesterase. The ClTE13-cDNA codes for a protein with 414 amino acids, which includes a deduced transit peptide of 111 amino acids. The full DNA sequence of the 1494 bp cDNA fragment with the deduced amino acid sequence is presented as SEQ ID NO:1 in the sequence listing. The coding region stretches from position 83 to position 1324 of the DNA sequence. The open reading frame begins at position 83 with the start codon "ATG", which codes for methionine, and ends at position 1324 with the stop codon "TAG". The deduced molecular weight of the mature protein is 34 kDa.
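The figures quoted for ClTE13 can be cross-checked with a little arithmetic. The sketch below is illustrative only: the coordinates and lengths come from the paragraph above, while the average residue mass of 110 Da is an assumption that does not appear in the text. Whether the stop codon is counted inside the span from position 83 to position 1324 is not settled here; the script only reports how many codons the span holds.

```python
# Coordinates and sizes quoted in the text for the ClTE13 cDNA (SEQ ID NO:1).
ORF_START = 83        # first base of the ATG start codon
ORF_END = 1324        # last position of the coding region as stated
PRECURSOR_AA = 414    # stated length of the encoded protein
TRANSIT_AA = 111      # stated length of the deduced transit peptide

# Assumption (not from the text): average mass of one amino acid residue.
AVG_RESIDUE_DA = 110.0

orf_nt = ORF_END - ORF_START + 1          # inclusive span in nucleotides
codons, remainder = divmod(orf_nt, 3)     # whole codons in that span
mature_aa = PRECURSOR_AA - TRANSIT_AA     # residues left after transit-peptide removal
mature_kda = mature_aa * AVG_RESIDUE_DA / 1000.0

print(f"ORF span: {orf_nt} nt = {codons} codons (remainder {remainder})")
print(f"Mature protein: {mature_aa} aa, roughly {mature_kda:.1f} kDa")
```

Under these assumptions the estimate for the mature protein lands close to the 34 kDa stated above.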

The DNA sequence analysis of the two other cDNAs, ClTE5 and ClTE12, has shown that these do not contain the complete structural gene for a middle chain-specific acyl-[ACP]-thioesterase. The DNA sequences of these cDNAs, including the deduced amino acid sequences, are listed in the sequence listing as SEQ ID NO:2 and SEQ ID NO:3. The ClTE5-cDNA has a length of 1404 bp as an EcoRI-XhoI fragment and codes in the open reading frame for a protein with 375 amino acids, in which 34 amino acids of the transit peptide are missing relative to the deduced amino acid sequence of ClTE13. The ClTE12-cDNA has a length of 1066 bp as an EcoRI-XhoI fragment, where the XhoI site is situated at the end, and codes according to the open reading frame for a protein of 287 amino acids, in which 20 amino acids of the mature protein and the transit peptide are missing.

In the following Table I the level of homology or identity is given between the acyl-[ACP]-thioesterase amino acid sequences of the mature proteins of the ClTE5 and ClTE13 cDNAs (deduced from the DNA sequence) from Cuphea lanceolata, CtTE2-1 and CtTE5-2 from Carthamus tinctorius, and UcTE from Umbellularia californica.

TABLE I
______________________________________________________
Percent Identity (above the diagonal)
         ClTE5    ClTE13   CtTE2-1  CtTE5-2  UcTE
______________________________________________________
ClTE5             91.3%    44.8%    48.2%    57.0%
ClTE13   96.0%             44.2%    45.7%    57.9%
CtTE2-1  67.1%    67.6%             82.5%    39.7%
CtTE5-2  71.1%    69.5%    91.1%             41.6%
UcTE     75.1%    76.3%    63.8%    62.7%
Percent Homology (below the diagonal)
______________________________________________________

The comparison of the TE amino acid sequence of ClTE13 with the thioesterase from U. californica (UcTE) shows a rather high agreement of 57.9% identical amino acids, which is higher than the agreement with the long chain-specific thioesterases from C. tinctorius (CtTE2-1 and CtTE5-2). The thioesterase of ClTE5 likewise shows a rather high agreement with UcTE, at 57.0% identity.
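Table I combines two measures in one square layout: percent identity above the diagonal and percent homology below it. As a reading aid, the following sketch stores the table as printed and looks up both values for a pair of sequences. The function name `lookup` and the data layout are illustrative choices, not part of the original disclosure.

```python
NAMES = ["ClTE5", "ClTE13", "CtTE2-1", "CtTE5-2", "UcTE"]

# Table I as printed: values above the diagonal are percent identity,
# values below the diagonal are percent homology; None marks the diagonal.
TABLE_I = [
    [None, 91.3, 44.8, 48.2, 57.0],
    [96.0, None, 44.2, 45.7, 57.9],
    [67.1, 67.6, None, 82.5, 39.7],
    [71.1, 69.5, 91.1, None, 41.6],
    [75.1, 76.3, 63.8, 62.7, None],
]

def lookup(a, b):
    """Return (percent identity, percent homology) for two distinct sequences."""
    i, j = sorted((NAMES.index(a), NAMES.index(b)))
    return TABLE_I[i][j], TABLE_I[j][i]

for partner in ("UcTE", "CtTE2-1", "CtTE5-2"):
    identity, homology = lookup("ClTE13", partner)
    print(f"ClTE13 vs {partner}: {identity}% identity, {homology}% homology")
```

The lookup for ClTE13 against UcTE returns the 57.9% identity discussed above, next to the lower identity values for the two C. tinctorius sequences.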

FIG. 3 shows an amino acid sequence comparison between thioesterases from plants. The sequences of the mature proteins (exception: ClTE12, which lacks 20 amino acids) are deduced from the corresponding thioesterase (TE) cDNAs from Carthamus tinctorius=Ct, Cuphea lanceolata=Cl, Brassica napus=Bn and Umbellularia californica=Uc. PCR42 is the PCR product that was used in the screening of the cDNA libraries. The gap between positions 374 and 393 (ca. 20 amino acids) occurs only in the middle chain-specific thioesterases, and lies shortly before the sole cysteine (position 359) that is conserved throughout all the sequences and is presumed to be the active cysteine residue. By changing the subsequence between the named positions, and others described below, the chain length specificity of thioesterases can be influenced through genetic engineering.

Furthermore, genomic clones were isolated from Cuphea lanceolata and characterized, which contain the full-length structural gene of a middle chain-specific acyl-[ACP]-thioesterase including regulatory sequences (such as promoters and terminators). This means that they form fully functional transcriptional units. During screening of a genomic library from Cuphea lanceolata with ClTE5-cDNA as a probe, 23 genomic clones were isolated. The genomic clones ClTEg1, ClTEg16, ClTEg4, and ClTEg7 are shown in FIG. 2 and were characterized by restriction analysis with various restriction enzymes. The DNA fragments in question have a size of 12.7 kb for ClTEg1, 17.4 kb for ClTEg16, 13.5 kb for ClTEg4 and 14.7 kb for ClTEg7. The restriction mapping showed that these genomic clones belong to four different classes of genes. It was determined from sequencing data that the cDNA ClTE5 corresponds to the gene of the genomic clone ClTEg4, the ClTE12-cDNA to the gene of the genomic clone ClTEg1, the ClTE13-cDNA to the gene of the genomic clone ClTEg7, and the PCR product PCR42 to the gene of the genomic clone ClTEg16.

Internal sequencing primers, positioned at the 5'-end, were derived from the cDNA sequences described above. These primers were used to obtain sequence data from the genomic clones, which give information about the start of the coding region and about the boundaries of the promoters of the thioesterase genes. From these diagnostic sequence regions of the genomic clones ClTEg1, ClTEg16, ClTEg4 and ClTEg7, which lie in the area of the smallest hybridizing fragments (see black bar in FIG. 2), it was possible to establish, by comparison with the amino acid sequence of the U. californica thioesterase, both the identity of the genes as genes for middle chain-specific thioesterases and the completeness of the thioesterase genes as transcriptional units.

The thioesterase genes were identified by DNA sequence analysis of selected sequence fragments of the genomic clones ClTEg1, ClTEg4, ClTEg7, and ClTEg16. The sequenced regions are recognizable as white bars under the clones shown in FIG. 2. All genes consist of seven exons, of which the first exon lies outside the translated region of the mRNA. The structural gene of a middle chain-specific acyl-[ACP]-thioesterase is located on a 4098 bp DNA fragment of clone ClTEg1; see SEQ ID NO:4 in the sequence listing. The coding region starts with exon II at position 1787 and ends with exon VII at position 3941. A 4643 bp DNA fragment of the clone ClTEg7 contains the structural gene of a middle chain-specific acyl-[ACP]-thioesterase. As can be seen from SEQ ID NO:6 in the sequence listing, the coding region begins at position 773 with exon II and ends with exon VII at position 3118. The genomic clone ClTEg16 contains the structural gene for a middle chain-specific acyl-[ACP]-thioesterase on a 5467 bp DNA fragment; see SEQ ID NO:7 in the sequence listing. The coding region begins with exon II at position 3284 and ends with exon VII at position 5275. The coding region for the structural gene of a middle chain-specific acyl-[ACP]-thioesterase is incomplete for genomic clone ClTEg4. SEQ ID NO:5 in the sequence listing shows exon II at positions 1 through 502 as well as the incomplete intron II at positions 503 through 928 on a 928 bp DNA fragment.
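To make the layout of the three complete genomic fragments easier to compare, the stated coordinates can be turned into segment lengths. This is a minimal sketch using only the numbers quoted above; the dictionary keys and the helper `layout` are hypothetical names chosen for illustration. Note that the region upstream of exon II also contains exon I and intron I, and that the span from exon II to exon VII includes the intervening introns.

```python
# Fragment length and coding-region coordinates as stated in the text
# (start of exon II, end of exon VII); values are 1-based and inclusive.
CLONES = {
    "ClTEg1":  {"seq_id": 4, "fragment_bp": 4098, "cds_start": 1787, "cds_end": 3941},
    "ClTEg7":  {"seq_id": 6, "fragment_bp": 4643, "cds_start": 773,  "cds_end": 3118},
    "ClTEg16": {"seq_id": 7, "fragment_bp": 5467, "cds_start": 3284, "cds_end": 5275},
}

def layout(info):
    """Split a fragment into upstream, exon II-VII span, and downstream lengths."""
    upstream = info["cds_start"] - 1
    span = info["cds_end"] - info["cds_start"] + 1   # exons II-VII plus the introns between them
    downstream = info["fragment_bp"] - info["cds_end"]
    return upstream, span, downstream

for name, info in CLONES.items():
    up, span, down = layout(info)
    print(f"{name} (SEQ ID NO:{info['seq_id']}): "
          f"{up} bp upstream, {span} bp exon II-VII span, {down} bp downstream")
```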

The structural genes for the middle chain-specific acyl-[ACP]-thioesterases, which were detected in the genomic clones ClTEg1, ClTEg7, and ClTEg16, each contain seven exons of almost identical size. Exon II of the thioesterase from clone ClTEg4 is of the same order of size as exon II of the other thioesterases. It is possible that intron I of all genes has regulatory functions in gene expression.

The genomic clone ClTEg4 was deposited under number DSM 8493, and the genomic clone ClTEg7 under the number DSM 8494, on Aug. 27, 1993 at the DSM-Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSM, German Collection of Microorganisms and Cell Cultures, Inc.), Mascheroder Weg 1B, D-38124 Braunschweig.

The DNA sequences according to the invention, which code for a middle chain-specific acyl-[ACP]-thioesterase, can be introduced into plants by gene technological procedures and can lead to production of these fatty acids in these plants (in the form of antisense expression or overexpression). The DNA sequences according to the invention are introduced into plants in particular with recombinant vectors, for instance binary vectors, preferably together with suitable promoters, unless they already constitute complete transcriptional units.

The genomic clones ClTEg1, ClTEg16, ClTEg4 and ClTEg7 can be used as complete transcriptional units (which contain promoter, structural gene, and terminator) for the transformation of plants, whereby middle chain fatty acids are accumulated in the storage lipids. The yield of middle chain fatty acids can be optimized by crossings and the resultant combination of thioesterase genes. An optimization can take place by increasing the content of newly introduced fatty acids, or by production of various new fatty acids.

All species of plants can be transformed for this purpose. Preferably, plants are transformed that are intended to show increased production of middle chain fatty acids, as well as plants that do not naturally synthesize these fatty acids. Oil plants, for instance rapeseed, sunflower, flax, oil palm and soybean, should be named in this context.

The gene technological introduction of DNA sequences according to the invention, which code for acyl-[ACP]-thioesterase, can be carried out with the usual transformation techniques. Such techniques include procedures for direct gene transfer, for instance microinjection, electroporation, the particle gun, the soaking of parts of plants in DNA solutions, pollen or pollen tube transformation, viral vectors and liposome-facilitated transfer, as well as the transfer of the appropriate recombinant Ti-plasmids or Ri-plasmids with Agrobacterium tumefaciens and transformation with plant viruses.

In the present invention, the ClTE13-cDNA was first introduced as an ApaI-EcoRI fragment into the binary vector pRE9, behind a double promoter constructed from the promoter of the 35S RNA of cauliflower mosaic virus (p35S/ΔP35S), giving pNBM99-3TE. Second, the same fragment was introduced into the binary vector pRE1, behind the seed-specific ACP23 promoter from rapeseed, giving pNBM99-2TE. Furthermore, an XbaI fragment (7.3 kb) from the genomic clone ClTEg1 and a SalI-EcoRI fragment (6 kb) from the genomic clone ClTEg16 were introduced into pRE1. The resulting binary vectors are pNBM99-TEg1 and pNBM99-TEg16. The TE gene from ClTEg1 is in 3'-5' orientation in pRE1, and the gene from ClTEg16 in 5'-3' orientation. The functional parts of the expression vectors obtained in this way are shown in FIG. 4.
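The four expression constructs differ in backbone, insert, and promoter, which is easy to lose track of in running text. The following sketch records them in a small table-like structure. The class and field names are hypothetical, and describing the genomic inserts as driven by their own promoters follows the statement elsewhere in this description that the genomic clones are complete transcriptional units.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Construct:
    name: str
    backbone: str      # binary vector
    insert: str        # cloned fragment as described in the text
    promoter: str      # promoter driving the thioesterase sequence

CONSTRUCTS = [
    Construct("pNBM99-3TE", "pRE9", "ClTE13 cDNA (ApaI-EcoRI)", "double 35S (CaMV)"),
    Construct("pNBM99-2TE", "pRE1", "ClTE13 cDNA (ApaI-EcoRI)", "ACP23 (seed-specific, rapeseed)"),
    Construct("pNBM99-TEg1", "pRE1", "ClTEg1 XbaI fragment, 7.3 kb", "own (genomic)"),
    Construct("pNBM99-TEg16", "pRE1", "ClTEg16 SalI-EcoRI fragment, 6 kb", "own (genomic)"),
]

def by_backbone(backbone):
    """List construct names built on the given binary vector."""
    return [c.name for c in CONSTRUCTS if c.backbone == backbone]

print(by_backbone("pRE1"))
print(by_backbone("pRE9"))
```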

The abbreviations used in FIG. 4 have the following meaning:

RB and LB=right and left border of the transfer DNA

pACP23=promoter of the acyl carrier protein gene 23 from rape

ClTE13=cDNA 13 from Cuphea lanceolata

tnos=terminator of the nopaline synthase gene from Agrobacterium tumefaciens

NPTII=neomycin phosphotransferase gene II

p35S=promoter of the 35S RNA of the cauliflower mosaic virus

t35S=terminator of the 35S RNA of the cauliflower mosaic virus

35S=minimal promoter of the 35S RNA of the cauliflower mosaic virus

The middle chain specificity of cDNA, which is in accordance with the patent, as well as the genes from the genomic clones, was analyzed by transformation. Appropriate plant materials are, for instance, rape and tobacco, since these plants are capable of producing only longer chain fatty acids, starting at C16:0 and up.

To this end, the expression vectors pNBM99-TEg1 and pNBM99-TEg16 were transformed, independently of each other, into rape by means of Agrobacterium. On Aug. 27, 1993 the expression vector pNBM99-TEg1 was deposited with the DSM (Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH, Mascheroder Weg 1B, D-38124 Braunschweig) under number DSM 8477, and the expression vector pNBM99-TEg16 under number DSM 8478.

Independent of each other, the expression vectors pNBM99-2TE and pNBM99-3TE were transformed into tobacco with Agrobacterium tumefaciens.

Then the transformed rape and tobacco plants were analyzed for their content of middle chain fatty acids. For this purpose, ripening seeds were analyzed by gas chromatography. FIGS. 5 and 6 show gas chromatograms from fatty acid extracts from transgenic rape and tobacco seeds, which had been transformed with the pNBM99-2TE construct.

As can be seen from the gas chromatograms, transgenic rape as well as transgenic tobacco produce capric acid (C_(10:0)). It can therefore be concluded that the cDNA ClTE13 and the gene from the genomic clone ClTEg7 (see above) code for a thioesterase which is specific, or for the most part specific, for C_(10:0).

It was determined in further studies on transgenic ripe rapeseeds, which were transformed with the expression vectors pNBM99-TEg1 and pNBM99-TEg16, that in the gas chromatograms of the fatty acid extracts the following amounts are produced: in the case of the gene from the genomic clone ClTEg1 (FIG. 7) 1.7% capric acid (C_(10:0)) and 0.4% caprylic acid (C_(8:0)), and in the case of the gene from the genomic clone ClTEg16 (FIG. 8) 5.4% myristic acid (C_(14:0)). The following table II shows the change in the fatty acid patterns of the transgenic rape plants in this study. The values refer to the %-fraction of fatty acid in ripe seed.

TABLE II
________________________________________________________________________________
Construct      C8    C10   C12   C14   C16    C18:0  C18:1  C18:2  C18:3  C20:0
________________________________________________________________________________
Control        --    --    --    --    3.0    2.3    76.5   9.9    6.0    0.8
pNBM99-TEg1    0.4   1.7   --    --    3.4    2.2    75.4   8.2    6.5    0.8
pNBM99-TEg16   --    --    --    5.4   13.4   1.9    56.6   13.7   7.1    0.7
________________________________________________________________________________
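The shifts in the fatty acid pattern can be read off Table II more directly by summing the middle chain fraction and subtracting the control row. The sketch below does this with the values from the table. Treating the "--" entries as 0.0 is an assumption made for the arithmetic (the table itself only indicates that no value is reported), and the function names are illustrative.

```python
FATTY_ACIDS = ["C8", "C10", "C12", "C14", "C16", "C18:0", "C18:1", "C18:2", "C18:3", "C20:0"]

# Percent of each fatty acid in ripe seed, from Table II; 0.0 stands in for "--" (assumption).
TABLE_II = {
    "Control":      [0.0, 0.0, 0.0, 0.0, 3.0, 2.3, 76.5, 9.9, 6.0, 0.8],
    "pNBM99-TEg1":  [0.4, 1.7, 0.0, 0.0, 3.4, 2.2, 75.4, 8.2, 6.5, 0.8],
    "pNBM99-TEg16": [0.0, 0.0, 0.0, 5.4, 13.4, 1.9, 56.6, 13.7, 7.1, 0.7],
}

MIDDLE_CHAIN = ("C8", "C10", "C12", "C14")   # the C8 to C14 range named in the description

def middle_chain_total(construct):
    """Sum of the C8 to C14 fractions for one construct, in percent."""
    row = dict(zip(FATTY_ACIDS, TABLE_II[construct]))
    return round(sum(row[fa] for fa in MIDDLE_CHAIN), 1)

def change_vs_control(construct):
    """Per-fatty-acid difference to the control row, in percentage points."""
    return {fa: round(value - control, 1)
            for fa, value, control in zip(FATTY_ACIDS, TABLE_II[construct], TABLE_II["Control"])}

for name in TABLE_II:
    print(name, middle_chain_total(name))
print(change_vs_control("pNBM99-TEg16"))
```

Read this way, the pNBM99-TEg16 row shows not only the new C14 fraction but also a higher C16 fraction and a lower C18:1 fraction than the control.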

It follows that the gene on the genomic clone ClTEg1 and the cDNA ClTE12 (see above) code for an acyl-[ACP]-thioesterase that is specific, or for the most part specific, for C_(10:0). The gene on the genomic clone ClTEg16 codes for an acyl-[ACP]-thioesterase that is specific, or for the most part specific, for C_(14:0). On comparing the amino acid sequences in FIG. 3 it can be noticed that the C10/C14 difference between ClTEg1 and ClTEg16 can be attributed to minor sequence variations, "RR" (positions 395/396) and "D" (position 398), etc. Furthermore, there is a gap (five amino acids) and there are amino acid changes in the area of positions 127 through 135. These regions may influence the chain length limits.

The DNA sequences according to the invention and the genes from the genomic clones isolated from Cuphea lanceolata are ideally suited to confer on transformed plants the capability to produce middle chain fatty acids. This means that it is possible to transfer a fully functional gene for a middle chain-specific acyl-[ACP]-thioesterase. In gas chromatographic studies of transgenic rape and tobacco, the formation of capric acid and myristic acid was demonstrated following the transfer of genes for an acyl-[ACP]-thioesterase that is specific, or for the most part specific, for C_(10:0) or C_(14:0). It turned out that the cDNA according to this invention, as well as the genes from the disclosed genomic clones, do not cause tolerance problems in rape and tobacco. Proper compartmentalization is ensured by the transit peptide that is present. For the regulated expression of the ClTE13-cDNA it is possible to use a seed-specific promoter, and the genes from the genomic clones shown are themselves regulated by their own tissue-specific promoters. Position effects can be compensated for by generating a sufficient number of transformants.

Therefore, the DNA sequences, which are in accordance with this invention, in form of the disclosed cDNAs as well as in form of the isolated genes from the shown genomic clones, are appropriate for the production of middle chain specific fatty acids in transgenic plants, preferably oil plants. An optimization of the content of middle chain specific fatty acids can be accomplished by additional transfer into rape of components of the fatty acid synthase systems, e.g. DNA sequences for ACP2, a specific KAS, keto reductase and enoyl reductase, for instance from Cuphea lanceolata. Over and above that it can be expected that the LPA-AT (lysophosphatide-acyl-transferase), e.g. from Cuphea lanceolata, which is located in the cytoplasm, can cause a marked increase in the content of middle chain fatty acids in the triacylglycerides in rape.

The following examples serve to illustrate this invention.

EXAMPLES

The plant material used in the context of the present invention consisted of the species Brassica napus (Cruciferae) (rape), Nicotiana tabacum (Solanaceae) (tobacco), and Cuphea lanceolata (Lythraceae) (lancet-leaved quiver flower). The summer rape line Drakkar and the tobacco line Petit Havanna SR1 were used for transformation.

Example 1

Production of cDNAs of Acyl-[ACP]-Thioesterase from Cuphea lanceolata

First, a cDNA library was constructed from Cuphea lanceolata (wild type) with the cDNA ZAP synthesis kit according to the specifications of the manufacturer (Stratagene). Poly(A)+ mRNA from isolated two- to three-week-old immature embryos served as the starting material for cDNA synthesis. The resulting cDNA library has a size of 9.6×10^5 recombinant phages, about 50% of which contain insertions of more than 500 bp.

For the screening of the cDNA library described above, a specific hybridization probe was constructed for the acyl-[ACP]-thioesterase. This requires suitable oligonucleotides. Voelker et al. (1992), Science 257, p. 72-74 describe the DNA sequence of a plant acyl-[ACP]-thioesterase. Oligonucleotide primers were derived from a few regions of this sequence with the lowest possible degeneracy, and were synthesized. The primer 3532, which corresponds to the amino acids 277-284 of the acyl-[ACP]-thioesterase from Umbellularia californica, is suitable for amplification of a specific hybridization probe in PCR reactions in conjunction with the primer 2740 (a modified oligo-dT primer with restriction sites for the enzymes BstBI, BamHI, HindIII, and SalI).

FIG. 1 and SEQ ID NO:9 and SEQ ID NO:10 in the sequence listing show the sequences of the synthetic oligonucleotide primers 3532 and 2740, which were used for the PCR reaction. The orientation of the oligonucleotide primers is from 5' to 3' for primer 3532, and 3' to 5' for primer 2740.

A cDNA synthesis was carried out at 37° C. for 30 minutes, starting with 1 μg poly(A)+ RNA and using reverse transcriptase from Avian Myeloblastosis Virus (AMV) (Boehringer Mannheim GmbH). The 3'-oligonucleotide primer (2740), which is shown in FIG. 1, was used for this purpose. After inactivation of the reverse transcriptase by heating for 5 minutes at 95° C., the PCR reaction was carried out in the same sample with a final amount of 50 pmol per primer and 4 units of Ampli-Taq polymerase (Perkin Elmer Cetus). The reactions took place under the following conditions: a) buffer conditions: 10 mM Tris-HCl, pH=8.0; 50 mM KCl; 1.5 mM MgCl₂; 0.01% gelatin and 5 mM dNTPs; b) reaction times and temperatures: 3 minutes at 92° C. for the first denaturation, then 25 to 30 temperature cycles of 2 minutes at 92° C. for denaturation, 2 minutes at 50° C. for the annealing of the oligonucleotides, and 2.5 minutes at 72° C. for the amplification of the DNA, followed by a final 7 minutes at 72° C. to complete the synthesis of the last products.
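The cycling conditions above can be written down as a small program description, which also gives the total programmed time for the stated range of 25 to 30 cycles. This is a sketch only: step names and the helper `total_minutes` are illustrative, and ramping time between temperatures is ignored.

```python
# Cycling program as described: (step name, minutes, temperature in deg C).
INITIAL_DENATURATION = ("initial denaturation", 3.0, 92)
CYCLE = [
    ("denaturation", 2.0, 92),
    ("annealing", 2.0, 50),
    ("extension", 2.5, 72),
]
FINAL_EXTENSION = ("final extension", 7.0, 72)

def total_minutes(n_cycles):
    """Programmed hold time in minutes, ignoring ramping between temperatures."""
    per_cycle = sum(minutes for _, minutes, _ in CYCLE)
    return INITIAL_DENATURATION[1] + n_cycles * per_cycle + FINAL_EXTENSION[1]

for n in (25, 30):   # the text gives a range of 25 to 30 cycles
    print(f"{n} cycles: {total_minutes(n):.1f} min")
```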

The amplification products produced in this way were then cloned. For this purpose, protruding single-stranded ends of the PCR products were filled in with Klenow polymerase, and the products were subsequently phosphorylated with polynucleotide kinase (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd edn., (1989)).

The PCR products were purified according to standard protocols as described in Sambrook et al. (see above) by agarose gel electrophoresis, gel elution, extraction with phenol/chloroform, and subsequent precipitation with isopropanol. The purified DNA obtained in this way was ligated into SmaI-cleaved pBluescript vector DNA and cloned.

Afterwards, the cloned PCR fragment was sequenced by the method of Sanger et al., Proc. Natl. Acad. Sci. 74, p. 5463-5467. The DNA sequencing was carried out partly with radioactive labeling using the Sequencing-Kit, and partly with the Pharmacia Automated Laser Fluorescent (A.L.F.) DNA sequencer. The sequences were analyzed with computer software of the University of Wisconsin Genetics Computer Group (Devereux et al., Nucl. Acids Res. 12, p. 387-395).

As can be seen from the sequence of the 530 bp acyl-[ACP]-thioesterase PCR product PCR42, which is shown as SEQ ID NO:8 in the sequence listing, a PCR product has been synthesized which shows significant homology to the starting sequence. The corresponding amino acid sequence is shown below the DNA sequence.

The 530 bp PCR product described above was used as a probe for the isolation of acyl-[ACP]-thioesterase cDNAs.

For this purpose the cDNA library described above was screened with the PCR product, and 11 cDNAs were isolated, which could be assigned to three classes based on their sequences.

In this context, the cDNA clones ClTE13, ClTE5, and ClTE12 were isolated, each of which represents one of the three classes. Their DNA sequences as well as the deduced amino acid sequences are presented as SEQ ID NO:1, 2 and 3 in the sequence listing.

Example 2

Production of Genomic Clones of Acyl-[ACP]-Thioesterase from Cuphea lanceolata

For this purpose, genomic DNA was isolated from young leaves of Cuphea lanceolata (S. L. Della Porta, J. Wood and J. B. Hicks, A plant DNA minipreparation: Version II, Plant. Mol. Biol. Rep. 1, p. 19-21 (1983)). The DNA was then partially cleaved with the restriction enzyme Sau3A, and the DNA fragments in the size range between 11000 bp and 19000 bp were cloned into the XhoI-cleaved vector FIX II (Stratagene) after the cleavage sites involved had each been partially filled in with two nucleotides. The non-amplified genomic DNA bank represented 5.4 times the genome of Cuphea lanceolata. From this bank, 102 hybridizing phages were isolated with the ClTE5 cDNA as a probe. Of these, 40 were further purified and 23 were mapped. They can be divided into four classes. FIG. 2 shows the genomic clones ClTEg1, ClTEg16, ClTEg4 and ClTEg7, which can be assigned to different classes. Appropriate DNA fragments of the genomic clones were sequenced. Their DNA sequences, as well as the deduced amino acid sequences, are presented as SEQ ID NO:4, 5, 6 and 7 in the sequence listing.
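To give a sense of what a 5.4-fold genome representation means, the following Python sketch applies the commonly used Clarke-Carbon estimate in its Poisson form, P = 1 - e^(-c), where c is the number of genome equivalents in the library. It assumes that fragments are cloned at random and represented uniformly, which the source does not state; the function names are hypothetical.

```python
import math


def probability_represented(coverage):
    """Probability that a given locus occurs in at least one clone.

    Assumes random, uniform cloning (Poisson approximation of the
    Clarke-Carbon formula); `coverage` is in genome equivalents.
    """
    return 1.0 - math.exp(-coverage)


def coverage_for_probability(probability):
    """Genome equivalents needed to reach the given probability (0 <= p < 1)."""
    return -math.log(1.0 - probability)


if __name__ == "__main__":
    # 5.4 genome equivalents, as stated for the non-amplified library.
    print(f"P(locus represented) = {probability_represented(5.4):.4f}")
    print(f"coverage for P = 0.99: {coverage_for_probability(0.99):.2f}")
```

Under these assumptions, a library of this depth would be expected to contain any given single-copy locus with a probability above 99%.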

Example 3

Transformation of Rape and Tobacco

Appropriate expression vectors were constructed. For this purpose, a chimeric gene consisting of the ClTE13 cDNA, the promoter ACP23 and the terminator tnos was inserted into the binary vector pRE1. The resulting vector is pNBM99-2TE. The vector pNBM99-3TE was produced by inserting the ClTE13 cDNA into the binary vector pRE9, downstream of the constructed double promoter of the 35S RNA from cauliflower mosaic virus (p35S/dp35S). Further expression vectors were produced using the genomic clones ClTEg1 and ClTEg16 and the binary vector pRE1. The expression vectors thus obtained were designated pNBM99-TEg1 and pNBM99-TEg16.

The transformation of rape was carried out with Agrobacterium tumefaciens following the protocol of DeBlock et al., Plant Physiol. 91, p. 694-701, starting with hypocotyl pieces. The Agrobacterium strain GV3101 C58C1 Rifr (Van Larebeke et al., Nature 252, p. 169-170 (1974)) was used with the Ti-plasmid pMP90RK (C. Koncz, J. Schell, Mol. Gen. Genet. 204, p. 383-396 (1986)) and the above-named expression vectors. Selection for kanamycin resistance was carried out first with 50 μg kanamycin per milliliter of medium (Medium A5) and later with 15 μg per milliliter (Media A6 and A8), using kanamycin monosulphate (Sigma K-4000). The transformation rate was 10% with respect to the number of hypocotyl pieces that were laid out. This figure is based on the verification of transformation by Southern blot analysis (Sambrook et al., supra) and PCR (Edwards et al., Nucl. Acids Res. 19, p. 1349 (1991)).

The transformation of tobacco was carried out with the above-named vector system according to the "leaf disc" transformation procedure (R. B. Horsch et al., Plant Mol. Biol. 20, p. 1229-1231 (1985)).

The fatty acids in the transformed plants were analyzed according to the method of W. Thies, Z. Pflanzenzuechtung 65, p. 181-202 (1971) and W. Thies, Proc. 4th Int. Rape Seed Conf., June 4-8, Giessen, p. 275-282 (1974), with a Hewlett-Packard gas chromatograph (Model HP5890 Series II with FID) and a 10 m long capillary column (FS-FFAP-CB-0.25 from CS-Chromatographie-Service GmbH, Langerwehe). The fatty acid methyl esters were separated in a temperature gradient from 140° C. to 208° C., rising at 20° C. per minute, with hydrogen as the carrier gas. After the final temperature of 208° C. was reached, the separation proceeded isothermally for seven minutes. The injector and detector were held at a constant 250° C.
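The oven program described above can be sketched in Python as follows. The sketch assumes that the ramp begins at injection with no initial hold at 140° C., which the source does not state explicitly; the function names and default parameters are hypothetical and simply restate the values given in the text.

```python
def ramp_duration(start=140.0, final=208.0, rate=20.0):
    """Minutes needed to heat from `start` to `final` (deg C) at `rate` deg C/min."""
    return (final - start) / rate


def total_run_time(start=140.0, final=208.0, rate=20.0, hold=7.0):
    """Ramp time plus the isothermal hold at the final temperature, in minutes."""
    return ramp_duration(start, final, rate) + hold


def oven_temperature(t_min, start=140.0, final=208.0, rate=20.0, hold=7.0):
    """Oven temperature (deg C) at `t_min` minutes after the start of the run.

    Assumption: the ramp starts at t = 0 with no initial isothermal phase.
    """
    if t_min < 0 or t_min > total_run_time(start, final, rate, hold):
        raise ValueError("time lies outside the programmed run")
    if t_min <= ramp_duration(start, final, rate):
        return start + rate * t_min
    return final


if __name__ == "__main__":
    print(f"ramp: {ramp_duration():.1f} min, total: {total_run_time():.1f} min")
    for t in (0.0, 1.0, 3.4, 5.0, 10.0):
        print(f"t = {t:4.1f} min -> {oven_temperature(t):.0f} deg C")
```

With the stated values, the ramp takes 3.4 minutes and a complete run, including the seven-minute isothermal phase, takes about 10.4 minutes.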

Where a molecular biological procedure has not been described in adequate detail, it was carried out following standard methods, as described by Sambrook et al., supra.

    __________________________________________________________________________     #             SEQUENCE LISTING     - (1) GENERAL INFORMATION:     -    (iii) NUMBER OF SEQUENCES: 14     - (2) INFORMATION FOR SEQ ID NO: 1:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 1494 Base               (B) TYPE: nucleic acid     #stranded (C) STRANDEDNESS: double               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE:: c-DNA to m-RNA     -    (iii) HYPOTHETICAL: No     -    (iii) ANTI-SENSE: No     -     (vi) ORIGINAL SOURCE:               (A) ORGANISM: Cuphea la - #nceolata     -    (vii) IMMEDIATE SOURCE:               (A) LIBRARY: c-DNA Bank - # ZAP               (B) CLONE: ClTE13     -     (ix) FEATURE:               (A) NAME/KEY: CDS               (B) LOCATION: 83..1324     -     (ix) FEATURE:               (A) NAME/KEY: Transit-Pept - #ide               (B) LOCATION: 83..415     -     (ix) FEATURE:               (A) NAME/KEY: mature Pr - #otein               (B) LOCATION: 416..1324     -     (ix) FEATURE:               (A) NAME/KEY: Startcodon               (B) LOCATION: 83..85     -     (ix) FEATURE:               (A) NAME/KEY: Stopcodon               (B) LOCATION: 1325..1327     #1:   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:     - GGGCCCCCTC GTGCCGCTCG TGCCGTTTTT TTGTCGCCAT TCGCCTCTCC TC - #TCCTCTCC       60     - TCTCCTCTTC AGTTGGAAAA CA ATG GTG GCC ACC GCT GCA - # AGT TCT GCA TTC      112     #       Met Val Ala Thr Ala Ala S - #er Ser Ala Phe     #      10     - TTC CCC CTG CCG TCC CCG GAC ACC TCC TCT AG - #G CCG GGA AAG CTC GGA      160     Phe Pro Leu Pro Ser Pro Asp Thr Ser Ser Ar - #g Pro Gly Lys Leu Gly     #                 25     - AAT GGG TCA TCG AGC TTG AGC CCC CTC AAG CC - #C AAA TTT GTC GCC AAT      208     Asn Gly Ser Ser Ser Leu Ser Pro Leu Lys Pr - #o Lys Phe Val Ala Asn     #             40     - GCC GGG TTG AAG GTT AAG GCA AGC GCC AGT GC - #C CCT CCT AAG ATC AAT      256     Ala Gly Leu Lys Val Lys Ala Ser Ala Ser Al - #a 
Pro Pro Lys Ile Asn     #         55     - GGT TCC TCG GTC GGT CTA AAG TCC GGC AGT CT - #C AAG ACT CAG GAA GAT      304     Gly Ser Ser Val Gly Leu Lys Ser Gly Ser Le - #u Lys Thr Gln Glu Asp     #     70     - ACT CCT TCG GTG CCT CCT CCG CGG ACG TTT AT - #C AAC CAG TTG CCT GAT      352     Thr Pro Ser Val Pro Pro Pro Arg Thr Phe Il - #e Asn Gln Leu Pro Asp     # 90     - TGG AGT ATG CTT CTT GCT GCA ATC ACT ACT GT - #C TTC TTG GCA GCA GAG      400     Trp Ser Met Leu Leu Ala Ala Ile Thr Thr Va - #l Phe Leu Ala Ala Glu     #                105     - AAG CAG TGG ATG ATG CTT GAC TGG AAA CCT AA - #G AGG CCT GAC ATG CTT      448     Lys Gln Trp Met Met Leu Asp Trp Lys Pro Ly - #s Arg Pro Asp Met Leu     #           120     - GTG GAC CCG TTC GGA TTG GGA AGT ATT GTC CA - #G GGT GGG CTT GTG TTC      496     Val Asp Pro Phe Gly Leu Gly Ser Ile Val Gl - #n Gly Gly Leu Val Phe     #       135     - AGG CAA AAT TTT TCT ATT AGG TCC TAT GAA AT - #A GGC GCT GAT CGC ACT      544     Arg Gln Asn Phe Ser Ile Arg Ser Tyr Glu Il - #e Gly Ala Asp Arg Thr     #   150     - GCA TCT ATA GAG ACG GTG ATG AAC CAC TTG CA - #G GAA ACG GCT CTC AAT      592     Ala Ser Ile Glu Thr Val Met Asn His Leu Gl - #n Glu Thr Ala Leu Asn     155                 1 - #60                 1 - #65                 1 -     #70     - CAT GTT AAG AGT GCT GGG CTT CTT AAT GAC GG - #C TTT GGT CGT ACT CCT      640     His Val Lys Ser Ala Gly Leu Leu Asn Asp Gl - #y Phe Gly Arg Thr Pro     #               185     - GAG ATG TTT AAA AGG GAC CTC ATT TGG GTT GT - #C GCG AAA ATG CAG GTC      688     Glu Met Phe Lys Arg Asp Leu Ile Trp Val Va - #l Ala Lys Met Gln Val     #           200     - ATG GTT AAC CGC TAT CCT ACT TGG GGT GAC AC - #G GTT GAA GTG AAT ACT      736     Met Val Asn Arg Tyr Pro Thr Trp Gly Asp Th - #r Val Glu Val Asn Thr     #       215     - TGG GTT GCC AAG TCA GGG AAA AAT GGT ATG CG - #T CGT GAT TGG CTC ATA      784     Trp Val Ala Lys Ser Gly Lys Asn Gly Met Ar - #g Arg Asp Trp Leu Ile     #   230     - 
AGT GAT TGC AAT ACA GGA GAA ATT CTA ACT AG - #A GCT TCA AGC GTG TGG      832     Ser Asp Cys Asn Thr Gly Glu Ile Leu Thr Ar - #g Ala Ser Ser Val Trp     235                 2 - #40                 2 - #45                 2 -     #50     - GTC ATG ATG AAT CAA AAG ACA AGA AAA TTG TC - #A AAA ATT CCA GAT GAG      880     Val Met Met Asn Gln Lys Thr Arg Lys Leu Se - #r Lys Ile Pro Asp Glu     #               265     - GTT CGA CAT GAG ATA GAG CCT CAT TTT ATA GA - #C TGT GCT CCC GTC ATT      928     Val Arg His Glu Ile Glu Pro His Phe Ile As - #p Cys Ala Pro Val Ile     #           280     - GAA GAC GAT GAC CGG AAA CTC CGC AAG CTG GA - #T GAG AAG ACT GCT GAC      976     Glu Asp Asp Asp Arg Lys Leu Arg Lys Leu As - #p Glu Lys Thr Ala Asp     #       295     - TCC ATC CGC AAG GGT CTA ACT CCG AAG TGG AA - #T GAC TTG GAT GTC AAT     1024     Ser Ile Arg Lys Gly Leu Thr Pro Lys Trp As - #n Asp Leu Asp Val Asn     #   310     - CAG CAT GTC AAC AAC GTG AAG TAC ATC GGG TG - #G ATT CTC GAG AGT ACT     1072     Gln His Val Asn Asn Val Lys Tyr Ile Gly Tr - #p Ile Leu Glu Ser Thr     315                 3 - #20                 3 - #25                 3 -     #30     - CCA CAA GAA GTT CTG GAG ACC CAA GAG TTA TC - #T TCC CTT ACC CTG GAA     1120     Pro Gln Glu Val Leu Glu Thr Gln Glu Leu Se - #r Ser Leu Thr Leu Glu     #               345     - TAC AGG CGG GAA TGC GGA AGG GAG AGT GTG CT - #G GAG TCC CTC ACT GCT     1168     Tyr Arg Arg Glu Cys Gly Arg Glu Ser Val Le - #u Glu Ser Leu Thr Ala     #           360     - GTG GAC TCC TCT GGA AAG GGC TTT GGG TCC CA - #G TTC CAA CAC CTT CTG     1216     Val Asp Ser Ser Gly Lys Gly Phe Gly Ser Gl - #n Phe Gln His Leu Leu     #       375     - AGG CTT GAG GAT GGA GGT GAG ATC GTG AAG GG - #G AGA ACT GAG TGG CGA     1264     Arg Leu Glu Asp Gly Gly Glu Ile Val Lys Gl - #y Arg Thr Glu Trp Arg     #   390     - CCC AAG ACT GCA GGT GTC AAT GGG GCA ATA GC - #A TCC GGG GAG ACC TCA     1312     Pro Lys Thr Ala Gly Val Asn Gly Ala Ile Al - #a Ser Gly 
Glu Thr Ser     395                 4 - #00                 4 - #05                 4 -     #10     - CAT GGA GAC TCT TAGAAGGGAG CCCCGGTCCC TTTCGAGTTC TG - #CTTTCTTT     1364     His Gly Asp Ser     - ATTGTCGGAT GAGCTGAGTG AACGGCAGGT AAGGTAGTAG CAATCAGTGG AT - #TGTGTAGT     1424     - TTATTTGCTG TTTTTCACTT CGGCTCTCTT GTATAAAAAA AAAAAAAAAA AA - #AAAACTCG     1484     #      1494     - (2) INFORMATION FOR SEQ ID NO: 2:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 1404 Base               (B) TYPE: nucleic acid     #stranded (C) STRANDEDNESS: double               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE:: c-DNA to m-RNA     -    (iii) HYPOTHETICAL: No     -    (iii) ANTI-SENSE: No     -     (vi) ORIGINAL SOURCE:               (A) ORGANISM: Cuphea la - #nceolata     -    (vii) IMMEDIATE SOURCE:               (A) LIBRARY: c-DNA Bank - # ZAP               (B) CLONE: ClTE5     -     (ix) FEATURE:               (A) NAME/KEY: CDS               (B) LOCATION: 15..1139     -     (ix) FEATURE:               (A) NAME/KEY: Transit-Pept - #ide               (B) LOCATION: 15..245     -     (ix) FEATURE:               (A) NAME/KEY: mature Pr - #otein               (B) LOCATION: 246..1139     -     (ix) FEATURE:               (A) NAME/KEY: Stopcodon               (B) LOCATION: 1140..1142     #2:   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:     - GAATTCGGCA CGAG CTC AAG CCC AAA TCC ATC CCC A - #AT GGC GGT TTG CAA       50                     Leu L - #ys Pro Lys Ser Ile Pro Asn Gly Gly Leu Gl - #n     #               10     - GTT AAG GCA AGC GCC AGT GCC CCT CCT AAG AT - #C AAT GGT TCC TCG GTC       98     Val Lys Ala Ser Ala Ser Ala Pro Pro Lys Il - #e Asn Gly Ser Ser Val     #         25     - GGT CTA AAG TCG GGC GGT CTC AAG ACT CAT GA - #C GAC GCC CCT TCG GCC      146     Gly Leu Lys Ser Gly Gly Leu Lys Thr His As - #p Asp Ala Pro Ser Ala     #     40     - CCT CCT CCC CGG ACT TTT ATC AAC CAG TTA CC - #T GAT TGG AGT ATG CTT      194     Pro Pro Pro Arg Thr 
Phe Ile Asn Gln Leu Pr - #o Asp Trp Ser Met Leu     # 60     - CTT GCT GCA ATC ACT ACT GCC TTC TTG GCA GC - #A GAG AAG CAG TGG ATG      242     Leu Ala Ala Ile Thr Thr Ala Phe Leu Ala Al - #a Glu Lys Gln Trp Met     #                 75     - ATG CTT GAT TGG AAA CCG AAG AGG CTT GAC AT - #G CTT GAG GAC CCG TTC      290     Met Leu Asp Trp Lys Pro Lys Arg Leu Asp Me - #t Leu Glu Asp Pro Phe     #             90     - GGA TTG GGA AGG ATT GTT CAG GAT GGG CTT GT - #G TTC AGG CAG AAT TTT      338     Gly Leu Gly Arg Ile Val Gln Asp Gly Leu Va - #l Phe Arg Gln Asn Phe     #        105     - TCG ATT AGG TCC TAC GAA ATA GGC GCC GAT CG - #C ACT GCG TCT ATT GAG      386     Ser Ile Arg Ser Tyr Glu Ile Gly Ala Asp Ar - #g Thr Ala Ser Ile Glu     #   120     - ACG GTG ATG AAT CAC TTG CAG GAA ACA GCT CT - #C AAT CAT GTT AAG ACT      434     Thr Val Met Asn His Leu Gln Glu Thr Ala Le - #u Asn His Val Lys Thr     125                 1 - #30                 1 - #35                 1 -     #40     - GCT GGG CTT TCT AAT GAC GGC TTT GGT CGT AC - #T CCT GAG ATG TAT AAA      482     Ala Gly Leu Ser Asn Asp Gly Phe Gly Arg Th - #r Pro Glu Met Tyr Lys     #               155     - AGG GAC CTT ATT TGG GTT GTT GCG AAA ATG CA - #G GTC ATG GTT AAC CGC      530     Arg Asp Leu Ile Trp Val Val Ala Lys Met Gl - #n Val Met Val Asn Arg     #           170     - TAT CCT ACT TGG GGT GAC ACA GTT GAA GTG AA - #T ACT TGG GTT GCC AAG      578     Tyr Pro Thr Trp Gly Asp Thr Val Glu Val As - #n Thr Trp Val Ala Lys     #       185     - TCA GGG AAA AAT GGT ATG CGT CGT GAC TGG CT - #C ATA AGT GAT TGC AAT      626     Ser Gly Lys Asn Gly Met Arg Arg Asp Trp Le - #u Ile Ser Asp Cys Asn     #   200     - ACA GGA GAG ATT CTT ACA AGA GCA TCA AGC GT - #G TGG GTA ATG ATG AAT      674     Thr Gly Glu Ile Leu Thr Arg Ala Ser Ser Va - #l Trp Val Met Met Asn     205                 2 - #10                 2 - #15                 2 -     #20     - CAA AAG ACA AGA AAA TTG TCA AAA ATT CCA GA - #T GAG GTT CGA CGT GAG     
 722     Gln Lys Thr Arg Lys Leu Ser Lys Ile Pro As - #p Glu Val Arg Arg Glu     #               235     - ATA GAG CCT CAT TTT GTG GAC TCT GCT CCC GT - #C ATT GAA GAC GAT GAC      770     Ile Glu Pro His Phe Val Asp Ser Ala Pro Va - #l Ile Glu Asp Asp Asp     #           250     - CGG AAA CTT CCC AAG CTG GAT GAG AAG AGT GC - #T GAC TCC ATC CGC AAG      818     Arg Lys Leu Pro Lys Leu Asp Glu Lys Ser Al - #a Asp Ser Ile Arg Lys     #       265     - GGT CTA ACT CCG AGG TGG AAT GAT TTG GAT GT - #C AAT CAG CAC GTC AAC      866     Gly Leu Thr Pro Arg Trp Asn Asp Leu Asp Va - #l Asn Gln His Val Asn     #   280     - AAC GTG AAG TAC ATC GGG TGG ATT CTT GAG AG - #T ACT CCA CCA GAA GTT      914     Asn Val Lys Tyr Ile Gly Trp Ile Leu Glu Se - #r Thr Pro Pro Glu Val     285                 2 - #90                 2 - #95                 3 -     #00     - CTG GAG ACC CAG GAG TTA TGT TCC CTT ACC CT - #G GAA TAC AGG CGG GAA      962     Leu Glu Thr Gln Glu Leu Cys Ser Leu Thr Le - #u Glu Tyr Arg Arg Glu     #               315     - TGT GGA AGG GAG AGC GTG CTG GAG TCC CTC AC - #T GCT GTG GAC CCC TCT     1010     Cys Gly Arg Glu Ser Val Leu Glu Ser Leu Th - #r Ala Val Asp Pro Ser     #           330     - GGA GAG GGC TAT GGA TCC CAG TTT CAG CAC CT - #T CTG CGG CTT GAG GAT     1058     Gly Glu Gly Tyr Gly Ser Gln Phe Gln His Le - #u Leu Arg Leu Glu Asp     #       345     - GGA GGT GAG ATC GTG AAG GGG AGA ACT GAG TG - #G CGA CCA AAG AAT GCT     1106     Gly Gly Glu Ile Val Lys Gly Arg Thr Glu Tr - #p Arg Pro Lys Asn Ala     #   360     - GGA ATC AAT GGG GGG GTA CCG TCC GAG GAG TC - #C TAACCTGGAG ACTACTCTTA     1159     Gly Ile Asn Gly Gly Val Pro Ser Glu Glu Se - #r     365                 3 - #70                 3 - #75     - GAAGGAGGAG CCCTGGGCTG GCCCCTTTGG AGTTATGCTT TCTTTTATTG TG - #GGATGAGC     1219     - TGAGTGAAGG GCAGGTAAGA TTAAGATAGT AGCAATCGGG AGATTGTGTA GT - #TTGTTTGC     1279     - TGCTTTTCAC TTTGGCTCTC TTGTATAATA TCATGGTCGT CGTCTTTGTA TC - #CTCGCATG     1339     - 
GTCCGGTTTG ATTTATACAT TATATTCTTT CTATTTGTTT CAAAAAAAAA AA - #AAAAAAAC     1399     #          1404     - (2) INFORMATION FOR SEQ ID NO: 3:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 1066 Base               (B) TYPE: nucleic acid     #stranded (C) STRANDEDNESS: double               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE:: c-DNA to m-RNA     -    (iii) HYPOTHETICAL: No     -    (iii) ANTI-SENSE: No     -     (vi) ORIGINAL SOURCE:               (A) ORGANISM: Cuphea la - #nceolata     -    (vii) IMMEDIATE SOURCE:               (A) LIBRARY: c-DNA Bank - # ZAP               (B) CLONE: ClTE12     -     (ix) FEATURE:               (A) NAME/KEY: CDS               (B) LOCATION: 15..875     -     (ix) FEATURE:               (A) NAME/KEY: Stopcodon               (B) LOCATION: 876..878     #3:   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:     - GAATTCGGCA CGAG GTT CGG GAT GGG CTC GTG TCC A - #GA CAG AGT TTT TTG       50                     Val A - #rg Asp Gly Leu Val Ser Arg Gln Ser Phe Le - #u     #               10     - ATT AGA TCT TAT GAA ATA GGC GCT GAT CGA AC - #A GCC TCT ATA GAG ACG       98     Ile Arg Ser Tyr Glu Ile Gly Ala Asp Arg Th - #r Ala Ser Ile Glu Thr     #         25     - CTG ATG AAC CAC TTG CAG GAA ACA TCT ATC AA - #T CAT TGT AAG AGT TTG      146     Leu Met Asn His Leu Gln Glu Thr Ser Ile As - #n His Cys Lys Ser Leu     #     40     - GGT CTT CTC AAT GAC GGC TTT GGT CGT ACT CC - #T GGG ATG TGT AAA AAC      194     Gly Leu Leu Asn Asp Gly Phe Gly Arg Thr Pr - #o Gly Met Cys Lys Asn     # 60     - GAC CTC ATT TGG GTG CTT ACA AAA ATG CAG AT - #C ATG GTG AAT CGC TAC      242     Asp Leu Ile Trp Val Leu Thr Lys Met Gln Il - #e Met Val Asn Arg Tyr     #                 75     - CCA ACT TGG GGC GAT ACT GTT GAG ATC AAT AC - #C TGG TTC TCT CAG TCG      290     Pro Thr Trp Gly Asp Thr Val Glu Ile Asn Th - #r Trp Phe Ser Gln Ser     #             90     - GGG AAA ATC GGT ATG GCT AGC GAT TGG CTA AT - #A AGT GAT TGC AAC ACA      338     Gly 
Lys Ile Gly Met Ala Ser Asp Trp Leu Il - #e Ser Asp Cys Asn Thr     #        105     - GGA GAA ATT CTT ATA AGA GCA ACG AGC GTG TG - #G GCT ATG ATG AAT CAA      386     Gly Glu Ile Leu Ile Arg Ala Thr Ser Val Tr - #p Ala Met Met Asn Gln     #   120     - AAG ACG AGA AGA TTC TCA AGA CTT CCA TAC GA - #G GTT CGC CAG GAG TTA      434     Lys Thr Arg Arg Phe Ser Arg Leu Pro Tyr Gl - #u Val Arg Gln Glu Leu     125                 1 - #30                 1 - #35                 1 -     #40     - ACA CCT CAT TTT GTG GAC TCT CCT CAT GTC AT - #T GAA GAC AAT GAT CAG      482     Thr Pro His Phe Val Asp Ser Pro His Val Il - #e Glu Asp Asn Asp Gln     #               155     - AAA TTG CAT AAG TTT GAT GTG AAG ACT GGT GA - #T TCC ATT CGC AAG GGT      530     Lys Leu His Lys Phe Asp Val Lys Thr Gly As - #p Ser Ile Arg Lys Gly     #           170     - CTA ACT CCG AGG TGG AAT GAC TTG GAT GTG AA - #T CAG CAC GTA AGC AAC      578     Leu Thr Pro Arg Trp Asn Asp Leu Asp Val As - #n Gln His Val Ser Asn     #       185     - GTG AAG TAC ATT GGG TGG ATT CTC GAG AGT AT - #G CCA ATA GAA GTT TTG      626     Val Lys Tyr Ile Gly Trp Ile Leu Glu Ser Me - #t Pro Ile Glu Val Leu     #   200     - GAG ACC CAG GAG CTA TGC TCT CTC ACC GTT GA - #A TAT AGG CGG GAA TGC      674     Glu Thr Gln Glu Leu Cys Ser Leu Thr Val Gl - #u Tyr Arg Arg Glu Cys     205                 2 - #10                 2 - #15                 2 -     #20     - GGA ATG GAC AGT GTG CTG GAG TCC GTG ACT GC - #T GTG GAT CCC TCA GAA      722     Gly Met Asp Ser Val Leu Glu Ser Val Thr Al - #a Val Asp Pro Ser Glu     #               235     - AAT GGA GGC CGG TCT CAG TAC AAG CAC CTT TT - #G CGG CTT GAG GAT GGG      770     Asn Gly Gly Arg Ser Gln Tyr Lys His Leu Le - #u Arg Leu Glu Asp Gly     #           250     - ACT GAT ATC GTG AAG AGC AGA ACT GAG TGG CG - #A CCG AAG AAT GCA GGA      818     Thr Asp Ile Val Lys Ser Arg Thr Glu Trp Ar - #g Pro Lys Asn Ala Gly     #       265     - ACT AAC GGG GCG ATA TCA ACA TCA ACA GCA AA - #G 
ACT TCA AAT GGA AAC      866     Thr Asn Gly Ala Ile Ser Thr Ser Thr Ala Ly - #s Thr Ser Asn Gly Asn     #   280     - TCG GCC TCT TAGAAGAGTC TCGGGACCCT TCCAAGATGT GCATTTCTT - #T      915     Ser Ala Ser     285     - TCTCTTTCTC ATTGTCTGCT GAGCTGAAAG AAGAGCATGT GGTTGCAATC AG - #TAAATTGT      975     - GTAGTTCGCT TTGCTTCGCT CCTTTGTATA ATAACATGGT CAGTCGTCTT TG - #TATCAAAA     1035     #        1066      AAAA AAAAACTCGA G     - (2) INFORMATION FOR SEQ ID NO: 4:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 4098 Base               (B) TYPE: nucleic acid     #stranded (C) STRANDEDNESS: double               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE:: DNS (genomic)     -    (iii) HYPOTHETICAL: No     -    (iii) ANTI-SENSE: No     -     (vi) ORIGINAL SOURCE:               (A) ORGANISM: Cuphea la - #nceolata     -    (vii) IMMEDIATE SOURCE:               (A) LIBRARY: genomic La - #mbda FIX II               (B) CLONE: ClTEg1     -     (ix) FEATURE:               (A) NAME/KEY: CDS               (B) LOCATION: join(1797..2 - #294, 2658..2791, 2898..3011,     3132                    ..3303, 3 - #391..3459, 3672..3941)     -     (ix) FEATURE:               (A) NAME/KEY: Startcodon               (B) LOCATION: 1797..1799     -     (ix) FEATURE:               (A) NAME/KEY: exon II               (B) LOCATION: 1787..2294     -     (ix) FEATURE:               (A) NAME/KEY: intron II               (B) LOCATION: 2295..2657     -     (ix) FEATURE:               (A) NAME/KEY: exon III               (B) LOCATION: 2658..2791     -     (ix) FEATURE:               (A) NAME/KEY: intron II - #I               (B) LOCATION: 2792..2897     -     (ix) FEATURE:               (A) NAME/KEY: exon IV               (B) LOCATION: 2898..3011     -     (ix) FEATURE:               (A) NAME/KEY: intron IV               (B) LOCATION: 3012..3131     -     (ix) FEATURE:               (A) NAME/KEY: exon V               (B) LOCATION: 3132..3303     -     (ix) FEATURE:               (A) 
NAME/KEY: intron V               (B) LOCATION: 3304..3390     -     (ix) FEATURE:               (A) NAME/KEY: exon VI               (B) LOCATION: 3391..3459     -     (ix) FEATURE:               (A) NAME/KEY: intron VI               (B) LOCATION: 3460..3671     -     (ix) FEATURE:               (A) NAME/KEY: exon VII               (B) LOCATION: 3672..3941     -     (ix) FEATURE:               (A) NAME/KEY: Stopcodon               (B) LOCATION: 3942..3944     #4:   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:     - CACCCATAAG AACCCAAAAG TCTGAAATAC AGTCAAAACC CGTAAAATTT TG - #ATATATTA       60     - TCGAATATTT TGGGATATTT GGTCCTTATG AGTGTTCGAG GGATATTTCA AA - #TTTTACGA      120     - ATATTCGGGA ATATTTCGCT ATTTAAAATT TTGCGGGATA TATTTGTAAT AT - #TTTATGAA      180     - TTATTGAAAT ATTTTTTGAA ATTTTAAAAT ATTTTTTAAA ATTTAAATAT AT - #TTTAAATT      240     - CTTTTAAAAA AAATATTTTT AAATATTATA AAATTAGTTT TTAAAATTTT TT - #AAATATTT      300     - TAAAATTAGT TTTTTTTATT TTTAAAATAT TGTTGAATTT TTAAAATATT TT - #TTGGTTTT      360     - AAAAATATAT TTAAAAGTTT TTAAATATTT TTTGAATTTT TGAAATATTG AA - #AAAATTTT      420     - GTTGGAGATA ACCGGAGAAT TTATATATAT ATATATATAT ATATATATAT AT - #ATATATTT      480     - CGTCCATTTC GGTTAAACCA AACGTAGTTC GTAACAGAAT GATAAACGTG AT - #CTATGGAA      540     - TGAAAGTTTA AGAGCAAACG AAGCTATTAT TTTAATTTAA AGACAAAAGT AG - #TGACAATT      600     - TATACTTTTA AGGCAAGTTT GACCGTTAAG TCTATTTTTT ATATTGACGG GA - #CGTGGCCA      660     - TGTAATTGGT TACTTTGTCG ATGTATGCCA TGTAAGAATC ATACGCCAAC GT - #TCGTTAAC      720     - GCCATTAACC ATACGTCATG TAAGAATATA CGTTCATTAG AAGGAACATG AA - #AGAAAGGG      780     - TACATATTCG ATCTATATAC CGATCTATAT ACCATAGTAT TCCATATAAA TA - #CCTTATTT      840     - AGAAATACCA TATTATATAG ATATCAACGT CATTAATAAA AAATAGAAGG TT - #GGACCCTG      900     - CATGTTACGA AATATAATGA GTTATATTTT AAATTTTGCT TTTGGATAAG TG - #ATCCCGAA      960     - AATAAGTGGA CGAAGTAATT AACCCAAATT TTTAAGCTCA AACTGATACA GT - #TGGATTCA     1020     - TAGTTGAGGA AATGAAAACA GCTGAAGATC 
GCAAAGTTTC CATTGCCATA CT - #CATACCTC     1080     - TTCATTCAGC TATGTCCCTT CCCTTGGCTT CCTATTTAAG CTGTTGTTTG TG - #TATGTCGC     1140     - CATTTGGCCC CTCCCTCCCC TCCTCTTCAG GTATACCCAC GGCCCTCATC AT - #TCTCTCAC     1200     - TACGTGTCTG TGTTTCCATC CCATTCCCCG CCCCGTCTCC TTTCCTTCCT TC - #ACGGGACT     1260     - TTGCTTTTGC ATACCCAGTG AACTGAACCC ACCCACCCCC AGTCACCCAG TT - #GTCATCTT     1320     - TTTTCTGCAA AGCCTCTCTG CTTTCTTCGT TTACCGTCGT CCTGAGCCCA TA - #GAAAAGTT     1380     - TGCCCATTTC CTCCTCGTGT TGATCGACCT CATGTCCCGT TTCTTGCCAA AT - #GTGCGGCC     1440     - CTTCTTCTCC TGCCCACTTT CTGTTTTTTA ATGTTATGCT CCGAGCCACG TT - #TCTTTGAT     1500     - TCTCTGTTCT CCTCACGGCG CCTTCCGGGC CACCGTCACT GTCCCCCTTC TT - #TATATGGC     1560     - TTCCGTTTTC CTTCGTTGCT GGATATCCCA TCCCATGTTC ATCTGAGTTT GC - #TGTCTACC     1620     - ATTTTCCCTG TATGTTATTT CCATGCATGC ATGCATGTCT ATGGCTTCCT TG - #TAGAAATG     1680     - TGTTGTGTTT TGTTATAAAG CTTCCATCTT TCCCTTCTGT TTGAATCCGA GG - #TTGTCGTT     1740     - TTAATGCAAT TAAAGCTTCT GCTAACTGAC CCTCTTGTGT TTACAGGCGA AG - #AAAC     1796     - ATG GTG GCT GCT GCA GCA ACT TCT GCA TTC TT - #C CCT GTT CCA GCC CCG     1844     Met Val Ala Ala Ala Ala Thr Ser Ala Phe Ph - #e Pro Val Pro Ala Pro     #                 15     - GGA ACC TCC CCT AAA CCC GGG AAG TCC GGC AA - #C TGG CCA TCG AGC TTG     1892     Gly Thr Ser Pro Lys Pro Gly Lys Ser Gly As - #n Trp Pro Ser Ser Leu     #             30     - AGC CCT ACC TTC AAG CCC AAG TCA ATC CCC AA - #T GCT GGA TTT CAG GTT     1940     Ser Pro Thr Phe Lys Pro Lys Ser Ile Pro As - #n Ala Gly Phe Gln Val     #         45     - AAG GCA AAT GCC AGT GCC CAT CCT AAG GCT AA - #C GGT TCT GCA GTA AAT     1988     Lys Ala Asn Ala Ser Ala His Pro Lys Ala As - #n Gly Ser Ala Val Asn     #     60     - CTA AAG TCT GGC AGC CTC AAC ACT CAG GAG GA - #C ACT TCG TCG TCC CCT     2036     Leu Lys Ser Gly Ser Leu Asn Thr Gln Glu As - #p Thr Ser Ser Ser Pro     # 80     - CCT CCC CGG GCT TTC CTT AAC CAG TTG CCT GA - #T TGG AGT ATG CTT CTG     
2084     Pro Pro Arg Ala Phe Leu Asn Gln Leu Pro As - #p Trp Ser Met Leu Leu     #                 95     - ACT GCA ATC ACG ACC GTC TTC GTG GCG GCA GA - #G AAG CAG TGG ACT ATG     2132     Thr Ala Ile Thr Thr Val Phe Val Ala Ala Gl - #u Lys Gln Trp Thr Met     #           110     - CTT GAT AGG AAA TCT AAG AGG CCT GAC ATG CT - #C GTG GAC TCG GTT GGG     2180     Leu Asp Arg Lys Ser Lys Arg Pro Asp Met Le - #u Val Asp Ser Val Gly     #       125     - TTG AAG AGT ATT GTT CGG GAT GGG CTC GTG TC - #C AGA CAG AGT TTT TTG     2228     Leu Lys Ser Ile Val Arg Asp Gly Leu Val Se - #r Arg Gln Ser Phe Leu     #   140     - ATT AGA TCT TAT GAA ATA GGC GCT GAT CGA AC - #A GCC TCT ATA GAG ACG     2276     Ile Arg Ser Tyr Glu Ile Gly Ala Asp Arg Th - #r Ala Ser Ile Glu Thr     145                 1 - #50                 1 - #55                 1 -     #60     - CTG ATG AAC CAC TTG CAG GTACTGCTTT GAAACTATTC AT - #TCATCGCA     2324     Leu Met Asn His Leu Gln                     165     - TATGCTAGTG ATCAGTAAAT GAGCCATGAC TAGATGATGA AATAGATAAC AC - #CGATTGCC     2384     - GGTACAACGA GCTAATTGTT CCATTTTAAT TTAGAAGTGC TCTTTTCTTG TT - #CATGACGA     2444     - GGTTGGTATC CCAGGGTGAG ATTTGTCAGG TTGATTCAAT GAAAGGGCTA AT - #TTTCGACG     2504     - CGTACTATGA TAGTTTTAAT GCTCTCATTC GAACTTGAAA TGACTAAGCA TT - #CTGATGAG     2564     - AAGTATTTAA TCTAAAATGC TTGCATTAGT TTTGCTTATA TTTTCTCGTT AA - #CTCGGTTG     2624     - TCTTTATTCT TGTTTTTTTT TTTCTCTTAA CAG GAA ACA TCT AT - #C AAT CAT TGT     2678     # Glu Thr Ser Ile Asn His Cys     #             170     - AAG AGT TTG GGT CTT CTC AAT GAC GGC TTT GG - #T CGT ACT CCT GGG ATG     2726     Lys Ser Leu Gly Leu Leu Asn Asp Gly Phe Gl - #y Arg Thr Pro Gly Met     #   185     - TGT AAA AAC GAC CTC ATT TGG GTG CTT ACA AA - #A ATG CAG ATC ATG GTG     2774     Cys Lys Asn Asp Leu Ile Trp Val Leu Thr Ly - #s Met Gln Ile Met Val     190                 1 - #95                 2 - #00                 2 -     #05     - AAT CGC TAC CCA ACT  TG  GTAAGTTTGT 
CACTGGCTG - #G TTTGTCTTTT     2821     Asn Arg Tyr Pro Thr  Trp                     210     - GGTCCGTAAG TGCCTCTTAC AATACTAGTT GTAAACATAG TGGAATGTAA TG - #GCCTGTAT     2881     - GTGATCTTTA TGGTAG G GGC GAT ACT GTT GAG ATC A - #AT ACC TGG TTC TCT     2931     #Phe Sersp Thr Val Glu Ile Asn Thr Trp     #              220     - CAG TCG GGG AAA ATC GGT ATG GCT AGC GAT TG - #G CTA ATA AGT GAT TGC     2979     Gln Ser Gly Lys Ile Gly Met Ala Ser Asp Tr - #p Leu Ile Ser Asp Cys     #       235     #3021GTATTATTAA             A AGA GCA ACG     #SerThr Gly Glu Ile Leu Ile Arg Ala Thr     #   245     - TTCTGGCTCT GAGTTTACAT TCTCAAAACC TTCTGATGCT CGATCAGTGA GC - #AGACATTT     3081     #GTG TGG    3138TGTAAAG TGGAGTCATG TCACTCTCAT ATTATCGCAG C     #   Val Trp     #   250     - GCT ATG ATG AAT CAA AAG ACG AGA AGA TTC TC - #A AGA CTT CCA TAC GAG     3186     Ala Met Met Asn Gln Lys Thr Arg Arg Phe Se - #r Arg Leu Pro Tyr Glu     #           265     - GTT CGC CAG GAG TTA ACA CCT CAT TTT GTG GA - #C TCT CCT CAT GTC ATT     3234     Val Arg Gln Glu Leu Thr Pro His Phe Val As - #p Ser Pro His Val Ile     #       280     - GAA GAC AAT GAT CAG AAA TTG CAT AAG TTT GA - #T GTG AAG ACT GGT GAT     3282     Glu Asp Asn Asp Gln Lys Leu His Lys Phe As - #p Val Lys Thr Gly Asp     #   295     - TCC ATT CGC AAG GGT CTA ACT GTAAGTCCCT ATCTTTCAC - #T GTGATATTAG     3333     Ser Ile Arg Lys Gly Leu Thr     300                 3 - #05     - GGCGGTTTTT ATGAAATATC GTGTCTCTGA GACGTTCTTC CACTTCATGG TT - #TGTAG     3390     - CCG AGG TGG AAT GAC TTG GAT GTG AAT CAG CA - #C GTA AGC AAC GTG AAG     3438     Pro Arg Trp Asn Asp Leu Asp Val Asn Gln Hi - #s Val Ser Asn Val Lys     #           320     - TAC ATT GGG TGG ATT CTC GAG GTACCCTTTT CATCGCACG - #C ACGAGAACAA     3489     Tyr Ile Gly Trp Ile Leu Glu             325     - CTGATATATT TTTTGGTTAA TGATGATAAG ATCAATAAAC TTAGATATTG AA - #TGCAAGTA     3549     - TCTGCTAGCT AGCACATGAG ATATTACTTA AATATCGTAG ACTAGTATCG CC - #CCGAGTTT     3609     - GTCAAAGCTT 
ACTTTAGGAT TCCGCTTTAC AGATCTTTGA TCTAGCCGAA TT - #CTTGTTGC     3669     - AG AGT ATG CCA ATA GAA GTT TTG GAG ACC CAG - # GAG CTA TGC TCT CTC     3716     #Gln Glu Leu Cys Ser Leual Leu Glu Thr     #  340     - ACC GTT GAA TAT AGG CGG GAA TGC GGA ATG GA - #C AGT GTG CTG GAG TCC     3764     Thr Val Glu Tyr Arg Arg Glu Cys Gly Met As - #p Ser Val Leu Glu Ser     345                 3 - #50                 3 - #55                 3 -     #60     - GTG ACT GCT GTG GAT CCC TCA GAA AAT GGA GG - #C CGG TCT CAG TAC AAG     3812     Val Thr Ala Val Asp Pro Ser Glu Asn Gly Gl - #y Arg Ser Gln Tyr Lys     #               375     - CAC CTT TTG CGG CTT GAG GAT GGG ACT GAT AT - #C GTG AAG AGC AGA ACT     3860     His Leu Leu Arg Leu Glu Asp Gly Thr Asp Il - #e Val Lys Ser Arg Thr     #           390     - GAG TGG CGA CCG AAG AAT GCA GGA ACT AAC GG - #G GCG ATA TCA ACA TCA     3908     Glu Trp Arg Pro Lys Asn Ala Gly Thr Asn Gl - #y Ala Ile Ser Thr Ser     #       405     - ACA GCA AAG ACT TCA AAT GGA AAC TCG GCC TC - #T TAGAAGAGTC TCGGGACCCT     3961     Thr Ala Lys Thr Ser Asn Gly Asn Ser Ala Se - #r     #   415     - TCCAAGATGT GCATTTCTTT TCTCTTTCTC ATTGTCTGCT GAGCTGAAAG AA - #GAGCATGT     4021     - GGTTGCAATC AGTAAATTGT GTAGTTCGCT TTGCTTCGCT CCTTTGTATA AT - #AACATGGT     4081     # 4098             A     - (2) INFORMATION FOR SEQ ID NO: 5:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 928 Base               (B) TYPE: nucleic acid     #stranded (C) STRANDEDNESS: double               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE:: DNA (genomic)     -    (iii) HYPOTHETICAL: No     -    (iii) ANTI-SENSE: No     -     (vi) ORIGINAL SOURCE:               (A) ORGANISM: Cuphea la - #nceolata     -    (vii) IMMEDIATE SOURCE:               (A) LIBRARY: genomic La - #mbda FIX II               (B) CLONE: ClTEg4     -     (ix) FEATURE:               (A) NAME/KEY: CDS               (B) LOCATION: 8..502     -     (ix) FEATURE:               (A) NAME/KEY: Startcodon 
              (B) LOCATION: 8..10     -     (ix) FEATURE:               (A) NAME/KEY: exon II               (B) LOCATION: 1..502     -     (ix) FEATURE:               (A) NAME/KEY: intron II               (B) LOCATION: 503..928     #5:   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:     #TTC TTC CCC GTG CCA       49 GCA AGT TCT GCA     #Ser Ser Ala Phe Phe Pro Val Pro     #       10     - TCT GCC GAC ACC TCC TCC AGA CCC GGA AAG CT - #C GGT AAT GGT CCA TCG       97     Ser Ala Asp Thr Ser Ser Arg Pro Gly Lys Le - #u Gly Asn Gly Pro Ser     # 30     - AGC TTC AGC CCC CTC AAG CCC AAA TCC ATC CC - #C AAT GGC GGT TTG CAG      145     Ser Phe Ser Pro Leu Lys Pro Lys Ser Ile Pr - #o Asn Gly Gly Leu Gln     #                 45     - GTT AAG GCA AGC GCC AGT GCC CCT CCT AAG AT - #C AAT GGT TCC TCG GTC      193     Val Lys Ala Ser Ala Ser Ala Pro Pro Lys Il - #e Asn Gly Ser Ser Val     #             60     - GGT CTA AAG TCG GGC GGT CTC AAG ACT CAT GA - #C GAC GCC CCT TCG GCC      241     Gly Leu Lys Ser Gly Gly Leu Lys Thr His As - #p Asp Ala Pro Ser Ala     #         75     - CCT CCT CCC CGG ACT TTT ATC AAC CAG TTG CC - #T GAT TGG AGT ATG CTT      289     Pro Pro Pro Arg Thr Phe Ile Asn Gln Leu Pr - #o Asp Trp Ser Met Leu     #     90     - CTT GCT GCA ATC ACT ACT GCC TTC TTG GCA GC - #A GAG AAG CAG TGG ATG      337     Leu Ala Ala Ile Thr Thr Ala Phe Leu Ala Al - #a Glu Lys Gln Trp Met     #110     - ATG CTT GAT TGG AAA CCG AAG AGG CTT GAC AT - #G CTT GAG GAC CCG TTC      385     Met Leu Asp Trp Lys Pro Lys Arg Leu Asp Me - #t Leu Glu Asp Pro Phe     #               125     - GGA TTG GGA AGG ATT GTT CAG GAT GGG CTT GT - #G TTC AGG CAG AAT TTT      433     Gly Leu Gly Arg Ile Val Gln Asp Gly Leu Va - #l Phe Arg Gln Asn Phe     #           140     - TCG ATT AGG TCC TAC GAA ATA GGC GCC GAT CG - #C ACT GCG TCT ATT GAG      481     Ser Ile Arg Ser Tyr Glu Ile Gly Ala Asp Ar - #g Thr Ala Ser Ile Glu     #       155     - ACG GTG ATG AAT CAC TTG CAG GTAATGGTGC ATGCTGCTT - #T TAAACTACTC   
   532     Thr Val Met Asn His Leu Gln     #   165     - AATTCATGAA ATGCTTATGG CCAGTAACTG AGCCATACTT GACTATGGCC TA - #CCCAATTT      592     - AATGGTGAAT TTGAGAAAGA GAAGGGTTGT ATTGCATGCA TTCCTTTCTC TG - #TTGTCATA      652     - GAGGTGATTC AGTATAGGTT TAACTCGTGT CAGTTTCATC GTATATGCAC TC - #TTTTCATG      712     - ATCACTTTGG TTTCTTATGG CGAGATTTGA GAATCATCTG CTCTACTTTT TT - #TTTATTAA      772     - ATTTAAGCAT TATGAAATTT TATGTGGAAC TTTTTACTTA TGTGACAGTA AA - #CCCAGGAA      832     - GGAACTCTCG TTTCATTTGA AAACTGGATT AGTTTTCTGA TATTTATGAA TT - #TCTAAAAA      892     #      928         TTAT AAGTTTGGTT GAATTC     - (2) INFORMATION FOR SEQ ID NO: 6:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 4643 Base               (B) TYPE: nucleic acid     #stranded (C) STRANDEDNESS: double               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: DNA (genomic)     -    (iii) HYPOTHETICAL: No     -    (iii) ANTI-SENSE: No     -     (vi) ORIGINAL SOURCE:               (A) ORGANISM: Cuphea la - #nceolata     -    (vii) IMMEDIATE SOURCE:               (A) LIBRARY:genomic Lambda - # FIX II               (B) CLONE: ClTEg7     -     (ix) FEATURE:               (A) NAME/KEY: CDS               (B) LOCATION: join(783..12 - #77, 1787..1920, 2120..2233, 2352                    ..2523, 2 - #615..2683, 2861..3118)     -     (ix) FEATURE:               (A) NAME/KEY: Startcodon               (B) LOCATION: 783..785     -     (ix) FEATURE:               (A) NAME/KEY: exon II               (B) LOCATION: 773..1277     -     (ix) FEATURE:               (A) NAME/KEY: intron II               (B) LOCATION: 1278..1786     -     (ix) FEATURE:               (A) NAME/KEY: exon III               (B) LOCATION: 1787..1920     -     (ix) FEATURE:               (A) NAME/KEY: intron II - #I               (B) LOCATION: 1921..2119     -     (ix) FEATURE:               (A) NAME/KEY: exon IV               (B) LOCATION: 2120..2233     -     (ix) FEATURE:               (A) NAME/KEY: intron IV       
        (B) LOCATION: 2234..2351     -     (ix) FEATURE:               (A) NAME/KEY: exon V               (B) LOCATION: 2352..2523     -     (ix) FEATURE:               (A) NAME/KEY: intron V               (B) LOCATION: 2524..2614     -     (ix) FEATURE:               (A) NAME/KEY: exon VI               (B) LOCATION: 2615..2683     -     (ix) FEATURE:               (A) NAME/KEY: intron VI               (B) LOCATION: 2684..2860     -     (ix) FEATURE:               (A) NAME/KEY: exon VII               (B) LOCATION: 2861..3118     -     (ix) FEATURE:               (A) NAME/KEY: Stopcodon               (B) LOCATION: 3119..3121     #6:   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:     - GTCGACTCGA TCCTTTCCTC CCGCTCGTAA TGACCCTTTA GCCCCCTTTG CC - #TTCTTCAA       60     - ATCCTCCTTT CCTTTCCCTT CTTCCTCTCT GGGAAGCTTA AAGCTTTGTC CC - #CCACAACC      120     - TCTTTCCCGC ATTCGTTGAG CTGTTTTTTT GTCGCCATTC GCCTCTCCTC TC - #CTCTCCTC      180     - TCCTCTTCAG GTTCGCCCCT ATCTCTCTCC CTCTCTCTTG TTTCGTCTCT TT - #GCCGGATT      240     - TGCAAACCCA TTGAATCCAG CTTGAGCCAC CCAATTGGTT ATAGATCTGC AA - #AGTCCCTT      300     - TTTTCCCCCT TCTCCGGCGC CGGAGCCCGT TTAGAAGTTC CCCATTTTCC AT - #TTTTTTTT      360     - CTCTTTTTTG CTGTCGGGTT GATGTCTCCT TGTTAGATCT GCCGAATGTC AG - #GCCTTTCC      420     - TGTCGTTTTT CAATCTTCTC TGATGATTTT TGACCCAGGT TCCTTTGTTT AT - #GTGTTCTT      480     - CTTCTTTGGA TGTTTCCTTC TTATCCCATC ATCAAAGTTT CTCTTTTTTT CC - #CAATGATT      540     - GTTGGGTCTT CCATCTTATT TGATTATGTT GTTTCGATGA TATCCCATGT TT - #ATCTGCGT      600     - TTTTCGAGCG ATTTTTCGGT CGCCATTTCC CTGCATGTCG GTGGCATTGG AT - #ATTCTTGT      660     - AACAATCTGA ATGGCATGTG TTGTGGTGAA AGCTTGGATC TTTGCCCTCT GT - #TTAAATCC      720     - TGCGTTTTCG GTTTAATCTA ATTGAAGATT GATCATTTTT CTGTGATTGC AG - #TTGGAAAA      780     - CA ATG GTG GCC ACC GCT GCA AGT TCT GCA TTC - # TTC CCC CTG CCG TCC      827     #Phe Phe Pro Leu Pro Ser Ala Ser Ser Ala     #  15     - CCG GAC ACC TCC TCT AGG CCG GGA AAG CTC GG - #A AAT GGG TCA TCG AGC     
 875     Pro Asp Thr Ser Ser Arg Pro Gly Lys Leu Gl - #y Asn Gly Ser Ser Ser     #                 30     - TTG AGC CCC CTC AAG CCC AAA TTT GTC GCC AA - #T GCC GGG TTG AAG GTT      923     Leu Ser Pro Leu Lys Pro Lys Phe Val Ala As - #n Ala Gly Leu Lys Val     #             45     - AAG GCA AGC GCC AGT GCC CCT CCT AAG ATC AA - #T GGT TCC TCG GTC GGT      971     Lys Ala Ser Ala Ser Ala Pro Pro Lys Ile As - #n Gly Ser Ser Val Gly     #         60     - CTA AAG TCC GGC AGT CTC AAG ACT CAG GAA GA - #T ACT CCT TCG GTG CCT     1019     Leu Lys Ser Gly Ser Leu Lys Thr Gln Glu As - #p Thr Pro Ser Val Pro     #     75     - CCT CCG CGG ACG TTT ATC AAC CAG TTG CCT GA - #T TGG AGT ATG CTT CTT     1067     Pro Pro Arg Thr Phe Ile Asn Gln Leu Pro As - #p Trp Ser Met Leu Leu     # 95     - GCT GCA ATC ACT ACT GTC TTC TTG GCA GCA GA - #G AAG CAG TGG ATG ATG     1115     Ala Ala Ile Thr Thr Val Phe Leu Ala Ala Gl - #u Lys Gln Trp Met Met     #               110     - CTT GAC TGG AAA CCT AAG AGG CCT GAC ATG CT - #T GTG GAC CCG TTC GGA     1163     Leu Asp Trp Lys Pro Lys Arg Pro Asp Met Le - #u Val Asp Pro Phe Gly     #           125     - TTG GGA AGT ATT GTC CAG GGT GGG CTT GTG TT - #C AGG CAA AAT TTT TCT     1211     Leu Gly Ser Ile Val Gln Gly Gly Leu Val Ph - #e Arg Gln Asn Phe Ser     #       140     - ATT AGG TCC TAT GAA ATA GGC GCT GAT CGC AC - #T GCA TCT ATA GAG ACG     1259     Ile Arg Ser Tyr Glu Ile Gly Ala Asp Arg Th - #r Ala Ser Ile Glu Thr     #   155     - GTG ATG AAC CAC TTG CAG GTACTGGTGC ATCCTGCAGT TA - #AACTATTC     1307     Val Met Asn His Leu Gln     160                 1 - #65     - AATTCATGAA ATGCTTATGT CCAGTAACCG AGCCATACTT GACTATGGCT TA - #CCAAATTT     1367     - AAGGGTGAAT TTGAGAAAGA AGGGTTGTAC TGCATTCCTC TCTATTGTCA TG - #AGGTGATT     1427     - CAATATAGGT TTACCTCGTG TCAATTTTTA ACATATGCAT TCATTTCATG AC - #GCTTTGGC     1487     - TTCTTATGGT GAGCTTTGTC ATGTCGAGTC AATGGAAGGA TCATCTGCTC TA - #CTATATTA     1547     - TTATTGAATT TAGGCATGAT GAAAGTTTAT GTGGAACTTA 
GTTACTTCTG TG - #ATAGAAAA     1607     - CCAAGAAAGG AAAGCTTCCT CTTCCCTTTA ACCTTAAAAA AAAAACCTTA AC - #TCTCATTT     1667     - CAATTGAAAA CTGGATTAGT TTTCAGATAT GTATATAATG ATTAAACATT TG - #CATTAGTT     1727     - TGCTCATAAT TTTGGTTGAA TTCATTGTCT TTGTCCTGTG CTTTTTTTTT CT - #TTTACAG     1786     - GAA ACG GCT CTC AAT CAT GTT AAG AGT GCT GG - #G CTT CTT AAT GAC GGC     1834     Glu Thr Ala Leu Asn His Val Lys Ser Ala Gl - #y Leu Leu Asn Asp Gly     #               180     - TTT GGT CGT ACT CCT GAG ATG TTT AAA AGG GA - #C CTC ATT TGG GTT GTC     1882     Phe Gly Arg Thr Pro Glu Met Phe Lys Arg As - #p Leu Ile Trp Val Val     #           195     - GCG AAA ATG CAG GTC ATG GTT AAC CGC TAT CC - #T ACT  TG  GTAAGTTTGT     1930     Ala Lys Met Gln Val Met Val Asn Arg Tyr Pr - #o Thr  Trp     #        210     - CACTAGCTTT TTACTTTGCG GTACTTCGAG GCTTTATAAA ATTTTGTGTC AA - #TGTAGCTG     1990     - TAATGTATAT CATATTGTAA TGAGTGCTCA CTGTTACCTT CCTTGTGATA TG - #GTGTTTCA     2050     - TTTCAATATA ACACCGATGA CTACAAATCT CCTTTATGTT GTGGAACCTA AG - #GGCCCTGT     2110     #TGG GTT GCC AAG TCA     2159 GAA GTG AAT ACT                 Gly Asp Thr Val - # Glu Val Asn Thr Trp Val Ala Lys Ser     #          220     - GGG AAA AAT GGT ATG CGT CGT GAT TGG CTC AT - #A AGT GAT TGC AAT ACA     2207     Gly Lys Asn Gly Met Arg Arg Asp Trp Leu Il - #e Ser Asp Cys Asn Thr     #   235     - GGA GAA ATT CTA ACT AGA GCT TCA  AG  GT - #ATGATGCA CTGTTTTGTA     2253     Gly Glu Ile Leu Thr Arg Ala Ser  Ser     240                 2 - #45     - GTTTATGTTC CTGTACTTTC TAGTGGTCAG ATTTGAGAGC ATTCAATCGG GA - #TATTTTAC     2313     #GTC ATG ATG     2367TT ACCCTTTTAT TATTGCAG C GTG TGG     #        Val Trp Val Met Met     #            250     - AAT CAA AAG ACA AGA AAA TTG TCA AAA ATT CC - #A GGT GAG GTT CGA CAT     2415     Asn Gln Lys Thr Arg Lys Leu Ser Lys Ile Pr - #o Gly Glu Val Arg His     #   265     - GAG ATA GAG CCT CAT TTT ATA GAC TGT GCT CC - #C GTC ATT GAA GAC GAT     2463     Glu Ile Glu Pro His Phe Ile 
Asp Cys Ala Pr - #o Val Ile Glu Asp Asp     270                 2 - #75                 2 - #80                 2 -     #85     - GAC CGG AAA CTC CGC AAG CTG GAT GAG AAG AC - #T GCT GAC TCC ATC CGC     2511     Asp Arg Lys Leu Arg Lys Leu Asp Glu Lys Th - #r Ala Asp Ser Ile Arg     #               300     - AAG GGT CTA ACT GTAAGGCCAT ATTTTACACT TTAATAGTGG CT - #TGCATTGC     2563     Lys Gly Leu Thr                 305     #CCG AAG    2620CATGCTT CTTAGACGAT TTTCCTCTTC GCAATTTGTA G     #   Pro Lys     - TGG AAT GAC TTG GAT GTC AAT CAG CAT GTC AA - #C AAC GTG AAG TAC ATC     2668     Trp Asn Asp Leu Asp Val Asn Gln His Val As - #n Asn Val Lys Tyr Ile     #       320     - GGG TGG ATT CTC GAG GTAACTTTTT AACCTGTTAG CTGAATATG - #T GTGTATCTCG     2723     Gly Trp Ile Leu Glu         325     - ATAAGATATA TGAACGTAGA TATTGACCCA AGTAACTGCT AGCACATCAT AT - #GTCCCTGA     2783     - AGTCCATTTA CAGTTATCAT ATTGCTAAAC TAATTATGCT GTTTCCTACA TA - #AACAATGT     2843     #GAG ACC CAA GAG      2893CT CCA CAA GAA GTT CTG     #Glur Thr Pro Gln Glu Val Leu Glu Thr Gln     #     335     - TTA TCT TCC CTT ACC CTG GAA TAC AGG CGG GA - #A TGC GGA AGG GAG AGC     2941     Leu Ser Ser Leu Thr Leu Glu Tyr Arg Arg Gl - #u Cys Gly Arg Glu Ser     340                 3 - #45                 3 - #50                 3 -     #55     - GTG CTG GAG TCC CTC ACT GCT GTG GAC TCC TC - #T GGA AAG GGC TTT GGG     2989     Val Leu Glu Ser Leu Thr Ala Val Asp Ser Se - #r Gly Lys Gly Phe Gly     #               370     - TCC CAG TTC CAA CAC CTT CTG AGG CTT GAG GA - #T GGA GGT GAG ATC GTG     3037     Ser Gln Phe Gln His Leu Leu Arg Leu Glu As - #p Gly Gly Glu Ile Val     #           385     - AAG GGG AGA ACT GAG TGG CGA CCC AAG ACT GC - #A GGT GTC AAT GGG GCA     3085     Lys Gly Arg Thr Glu Trp Arg Pro Lys Thr Al - #a Gly Val Asn Gly Ala     #       400     - ATA GCA TCC GGG GAG ACC TCA CAT GGA GAC TC - #T TAGAAGGGAG CCCCGGTCCC     3138     Ile Ala Ser Gly Glu Thr Ser His Gly Asp Se - #r     #   410     - 
TTTCGAGTTC TGCTTTCTTT ATTGTCGGAT GAGCTGAGTG AACGGCAGGT AA - #GGTAGTAG     3198     - CAATCAGTGG ATTGTGTAGT TTATTTGCTG TTTTTCACTT CGGCTCTCTT GT - #ATAATATC     3258     - ATGGTCTTCT TTGTGTTCTC GCATGTTTCG GGCTGATTTA TATATTATAT TC - #TTTCTATT     3318     - TGTTTCAAGG TGAGTAGCGA GTTGTAATTA TTTATTTTGT CGTTAAATTT TC - #AAATGAAA     3378     - GTACTTATGT GAACTGCATC GCCTTCCCTC AGAAGGTATC ATAATGAATT GT - #TACCATGT     3438     - TGCTGCGCTG CGAGTCTGTT ACTTGTCATA TGCCGGTGTG GTTTGGTTTG GT - #GTGCTGTT     3498     - CTGTTCTGTG TTGGAGTGGC TTAATTCGGA TAATATGTTT GTTTTATCAC CA - #GGGAGCAC     3558     - AAGAAACCGC AAACAAGATA AGACACTGCT CATGGCTACA TTTGCTTAAC CT - #AGTAGGAC     3618     - CAAAAGCAAC TCGAAAATCG AAAATGGGTG ATCATAGATT GATCTATGTT AA - #ATTGTAAC     3678     - CAACCTTTTG AAATTAATGA TGTTGATTCT TTGGTTGAGG AACATTTTGC TT - #CCTCAAAT     3738     - TCATTGTATT TTGTTGCTTC ATTCAATGTG GTTTAATTTA ACCAGAGTTA TA - #TTGTTCAT     3798     - TTATTTTCTC CTTCCCCTTT TTAGTAGATA CATTGCCCCG GCTTCGCCCG GA - #TATGCTAC     3858     - TATCGTATAT TATTTAATTA CAAACTATAT ATATATGTAT ATGTACATAT CA - #CTTTCGAT     3918     - CAATTGTTAA AATAATAATA AACATTTAGT CACGATCATA AAATCATATA AA - #AGTTGGTA     3978     - CAATTAGAGG TTGTGAAAGT TGGTACAATT AGAGGTCGTG AACTATGAAC AA - #AGAAATAC     4038     - ATATATCAAA GAAATACATA TATCAAAGCT TAAATGATCT TGCTCAAGAT AT - #TATAATAT     4098     - ATATTGTGCT CTGTATAATA TAAGTAATAT ATATTGTGTG CTCTATACCA CA - #TCTTCAGA     4158     - CACGAAGAAG GTTTCCAACT TAAAGGAAAG AATGTGTTCC TGTAAAGTTT TG - #CAAGAAAA     4218     - TAACCTACGT AAAGACTTTG AAAAGTATGC CGTTGTGATA AAAGAACTCG CA - #ACAATATT     4278     - AATTGTCGAA GATTCGAACC ATGCAAACTG AAAGCCTATA TGAGGAATAC AT - #CACGAGAA     4338     - AAAATATTAA TAGCATAGCA AAAGAAAGAG AAAGTCAGAA TTTCTCTACC CA - #ATATCGTT     4398     - GAAACATGCG ATGGTAGAGC AACACACAGG AGCGGTAAGA ACCGACGAAG GA - #GTTGGTGT     4458     - AAGATAAATA AACATTCAAA ATATCAATTA AATAAACTAT CAAGATATGG TG - #ACACTATC     4518     - GATTAGACGC GAAACGCCTT TAACAGAGGA TATTGCGCAA 
GTTCCTCTAC CAACTAATTG     4578
TGACATATCT CACATCCCGA TGTCTGATGA TACAAAAGAT CCCGAGAATC ACAACCGGCG     4638
#          4643

(2) INFORMATION FOR SEQ ID NO: 7:

     (i) SEQUENCE CHARACTERISTICS:
               (A) LENGTH: 5467 base pairs
               (B) TYPE: nucleic acid
               (C) STRANDEDNESS: double stranded
               (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: DNA (genomic)

   (iii) HYPOTHETICAL: No

   (iii) ANTI-SENSE: No

    (vi) ORIGINAL SOURCE:
               (A) ORGANISM: Cuphea lanceolata

   (vii) IMMEDIATE SOURCE:
               (A) LIBRARY: genomic Lambda FIX II
               (B) CLONE: ClTEg16

    (ix) FEATURE:
               (A) NAME/KEY: CDS
               (B) LOCATION: join(3294..3779, 4045..4178, 4282..4395, 4512..4680, 4767..4835, 5012..5275)

    (ix) FEATURE:
               (A) NAME/KEY: Startcodon
               (B) LOCATION: 3294..3296

    (ix) FEATURE:
               (A) NAME/KEY: exon II
               (B) LOCATION: 3284..3779

    (ix) FEATURE:
               (A) NAME/KEY: intron II
               (B) LOCATION: 3780..4044

    (ix) FEATURE:
               (A) NAME/KEY: exon III
               (B) LOCATION: 4045..4178

    (ix) FEATURE:
               (A) NAME/KEY: intron III
               (B) LOCATION: 4179..4281

    (ix) FEATURE:
               (A) NAME/KEY: exon IV
               (B) LOCATION: 4282..4395

    (ix) FEATURE:
               (A) NAME/KEY: intron IV
               (B) LOCATION: 4396..4511

    (ix) FEATURE:
               (A) NAME/KEY: exon V
               (B) LOCATION: 4512..4680

    (ix) FEATURE:
               (A) NAME/KEY: intron V
               (B) LOCATION: 4681..4766

    (ix) FEATURE:
               (A) NAME/KEY: exon VI
               (B) LOCATION: 4767..4835

    (ix) FEATURE:
               (A) NAME/KEY: intron VI
               (B) LOCATION: 4836..5011

    (ix) FEATURE:
             (A) NAME/KEY: exon VII               (B) LOCATION: 5012..5275     -     (ix) FEATURE:               (A) NAME/KEY: Stopcodon               (B) LOCATION: 5276..5278     #7:   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:     - GTCGACTCGA TCCACCCAAC TTAATGCAAG TGGCTCTTAA ACTCTTGCTT GT - #TTGCTTGC       60     - TGCACTTGTC ATGCAGGTTG GTGGAATCTA TGTGAGGCTG TTCTTGAAAG AC - #CCCAAGTT      120     - TCCTCTCCGA AATCCGAAGA GGTTCCTTGA AGGTCTCCTG GATCAGTATC TA - #TCAGCAGT      180     - GGCTGCAACA CACTATGAAA CGCAAGTGGA CCCCGAGCTT CCCTTGCTTT TA - #TCAGCTGC      240     - CCTAGTTTCT TTACTGCGAG TTCACCCTGC ACTCGCTGAT CATGTGGGTT AT - #CTCGGCTA      300     - TGTGCCTAAG CTTGTTGCTG CTGTTGCCTA TGAAAGTAGA AGAGAAACAA TG - #TCCTCAGT      360     - GGAGGAGAAT AATGGCCACG CAGACAGAGC AGCCTATGAG CCTGGTGATG GG - #TTAGAACA      420     - ACCCACTCAG ACCCCACGAG AGCGAGTCCG ACTCAGCTGC TTACGTGTTT TG - #CATCAGCT      480     - TGCAGCGAGT ACAACTTGTG CTGAAGCTAT GGCTGCAACT AGTGTTGGGA CA - #CCACAGGT      540     - AGATCTTATT TCTCGTATAT GTATATGCAT TGGTGTCTGC AATTTACATG AT - #TAGCTAAG      600     - AAGAATGTTC CTGATATATG TCAAAGATTC TTCCGAGTTG AATGCCCTGA CA - #GGTTCATG      660     - CATACCTTGA GTTGCAGGTT GTTCCAATTC TAATGAAAGC AATAGGCTGG CA - #AGGCGGAA      720     - GTATATTAGC CCTTGAGACA CTGAAACGGG TTGTTGTCGC TGGAAATCGG GC - #TAGGGATG      780     - CCCTGGTGGC TCAAGGACTC AAGTAAGTTT ATTATCGGAT ACAGGGCCTT CC - #ATACTTCG      840     - ATAGAAGTTC ATTCTCGTGT CTGATTGAGT GAAATTTTCA GGGCTGGTCT AG - #TTGAAGTC      900     - CTTCTCGGGC TTCTTGACTG GAGAGCTGGA GGAAGACATG GACTCTGTGC TC - #AGATGAAG      960     - TGAACGAATC TGAAGCATCT ATTGGAAGGG TTCTTGCCAT AGAGGTCAGG AT - #AGTTAACT     1020     - TTATTTTGTC TGCAGTATCG TGACATTGTT GCCTCACGAT ATGCCGTTAA TT - #TTTTGACC     1080     - GCCAAACACG GGTGTAAAAA AAAAAGTATC TTAAATGTAT GACTCAGGTT TA - #TCACGTCA     1140     - TTTGCAACCG AAGGGGCCCA TTGTACTAAA GTGCGTGAAC TGCTGGATTC GT - #CTGATGTA     1200     - AGTTACCTCA GCTTTCTTCT GTTGTGTCTT TATCCTGCAA ACCTTTTCAT GC - #AGTTGGCG 
    1260     - ATATCTTAGG GCCGGCATGG TGGTTGCTCG TTGCTTGATA TTATAGTCGA GT - #TAGATATT     1320     - GTGATTCCAG TAATGTAATA TTTTGCACTT GCATGTTGCC AATGGTCATA AT - #CAGTGTTG     1380     - TCTAGAGAAT AGTATTTGGA TCTTTTCTAA ATATCGAGTT CTGATATGCT AA - #TCCTAAAT     1440     - CTTATCTTTT TAACCTCTCT TTTCTTTGAT TGTTTTCAGG TTTGGGGTGC AT - #ACAAAGAC     1500     - CAAAAGCACG ACCTCTTCCT TCCATCAAGT GCTCAGTCCG CTGCTGGAGT GG - #CTGGCTTG     1560     - ATTGAGAACT CGTCCTCTCG ACTCACGTAT GCCCTCACAG CCCCGCCTCC CA - #CATCATCT     1620     - CCTCCATCAT ACTCCAATGG CAACGAAGAT ATCTTCCATC TGTAAAGACA AG - #TCCTGTAG     1680     - TGATATAAAA TAGCTCATTT CTGTACAGGT TTTCGTTGGC TTTAGTCATC AG - #GCTTTCGA     1740     - GTTTGTTCAT GTTTCGTTTC TTCTTACATC ATATATATCC TTGGGGGCGT TG - #CAGATTGG     1800     - CATGGCGTTT TCATTTTCAA TCTCCTGATA TCAAACCTTG GAATTTATTC CT - #TTGCTTCA     1860     - TTTTTACTCC ACACTCCACT GTAAAGATCA CTCGATCATT TATGTGTAAA TT - #GAGGTTCT     1920     - GGTTGCTTTC TGCACATTTT TTATATGATC ATTTTCAATG GTCACTATTT CT - #TCTGTATC     1980     - ACTAAAGAGC CTATATTAAT AAATAAAGAT TCATCATCAT CCCATTCATA TA - #TTTGCTCT     2040     - ATTCCTATGT ATAATATTAT TTTCATTCAA AAATTGTTTG TGAATTCCGA CT - #TCAATGAG     2100     - ATTCTAAATT TAGAATCCCA TGCCAACTAA GATAGACTCT AATGTAGATT CA - #AATTATTT     2160     - TGAAGACTCT AAATTGACAT TTAAAAAGTT TTTATGGAGA TGTTCTAAGC GG - #CACCTTCA     2220     - TAAGAATTAA AAATACTAAA TAAATTTTTT AGTGAAAGGT CAAATGTGCC TA - #TAATAAGT     2280     - AAAGAAAAGT TATTATTAAT GATTTATTAA AGTAATATCT CTTTTTTTTT TT - #TTTACAAG     2340     - TTCTAATATT TGAAGATAAA AAAAAAAAAA AAATTACACG TGAAAGCTGA AA - #TGAAACTC     2400     - AAACTCCCCT GACACCTTTC GCTTCGCACT GTCTCTGTCT TCTAAAATCC AC - #GAGTCGGG     2460     - AAAGAAAGAT TCAATTTGAT TCACTGTTGA CGAAGCTGAA GATCACAAAT TT - #CCAACCTC     2520     - AGGATACCTC TTACCTTTGC CTTTGCCTTT GCTTTTTTCT TTGCCTCTCT TC - #TCTTCATT     2580     - CGGCTCTGTC CCTTCCCCTC TTCGCGTTGC TTCTTCTATT GAACTGTTGT CT - #GTTCATGT     2640     - CACCGTTTGC CCTTCCACTT 
CAGCTATATG GCCCTCTCTC TCTCGCACTA CG - #TGTCTGAT     2700     - CTGCAGTTTC CATTCCCGCT TCTGTCTCCT TCCTTCACAA GACTTCATTT GC - #ATACACCA     2760     - CTGACCTGAG CCCCACCCAC CCTCCGTCAC CCAGTGTCAC TCTTCTGCAA AC - #CCATCTGC     2820     - TCTCTTCTTT TTCCCTCCAC CGTAGCCCAT AGAAACCACC TTCGCCCTTT TC - #CTCCTCGT     2880     - GTTGATCGGA CCTCATCATG TCTCCTTTCT TTCTGCCAAA TGTCTGGCCT TT - #CTTCTCGC     2940     - GCCCACTTTT GTTTTTAATG TTATGCTCCC AGCCACGTTC CTTCCATTCT CT - #GCTCTCCT     3000     - CATGGCTCCT TCCGGGCCAC CATCAGAGTC CCCTTCTTTA TATGGCTTCC AT - #TTTCCTTC     3060     - CTTGATGGAT ATCCCATCTT CATCTGTGTT TGCTGGATAC CATTTTCCCT GT - #ATGTTCAG     3120     - TTCATGCCAT GCATGTCTAT GCCTTTCTTT CCCCTTACTA CATTTGCTGT AA - #CATTGTGT     3180     - TGTGTTTTGT CATAAAGCTT TCATCTTTCC CTTCTGTTTG AATCCGAGGT TG - #TCTTTTTT     3240     - ATGCATTTCA AGCTTCTGAT GACTGACCCT TTTGTGCTTT CAGGCGAACA AA - #C ATG     3296     #     Met     #       1     - GTG GCT GCC GCA GCA AGC TCT GCA TTC TTC TC - #C TTT CCA ACC CCC GGA     3344     Val Ala Ala Ala Ala Ser Ser Ala Phe Phe Se - #r Phe Pro Thr Pro Gly     #              15     - ACC TCC CCC AAA CCC GGG AAG TTC GGC AAC TG - #G CCA TCG AGC CTG AGC     3392     Thr Ser Pro Lys Pro Gly Lys Phe Gly Asn Tr - #p Pro Ser Ser Leu Ser     #         30     - GTC CCC TTC AAT CTC AAA TCA AAC CAC AAT GG - #T GGC TTT CAG GTT AAG     3440     Val Pro Phe Asn Leu Lys Ser Asn His Asn Gl - #y Gly Phe Gln Val Lys     #     45     - GCA AAC GCC AGT GCT CAT CCT AAG GCT AAC GG - #T TCT GCA GTA AGT CTA     3488     Ala Asn Ala Ser Ala His Pro Lys Ala Asn Gl - #y Ser Ala Val Ser Leu     # 65     - AAG GCT GGC AGC CTC GAG ACT CAG GAG GAC AC - #T TCA GCG CCG TCC CCT     3536     Lys Ala Gly Ser Leu Glu Thr Gln Glu Asp Th - #r Ser Ala Pro Ser Pro     #                 80     - CCT CCT CGG ACT TTC ATT AAC CAG TTG CCT GA - #C TGG AAT ATG CTT CTG     3584     Pro Pro Arg Thr Phe Ile Asn Gln Leu Pro As - #p Trp Asn Met Leu Leu     #             95     - TCC GCA ATC ACG ACT GTC 
TTC GTT GCG GCT GA - #G AAG CAG TGG ACG ATG     3632     Ser Ala Ile Thr Thr Val Phe Val Ala Ala Gl - #u Lys Gln Trp Thr Met     #       110     - CTT GAT CGG AAA TCT AAG AGG TCA GAC GTG CT - #C GTG GAA CCG TAT GTT     3680     Leu Asp Arg Lys Ser Lys Arg Ser Asp Val Le - #u Val Glu Pro Tyr Val     #   125     - CAG GAT GGT GTT TCG TTC AGA CAG AGT TTT TC - #G ATA AGG TCT TAC GAA     3728     Gln Asp Gly Val Ser Phe Arg Gln Ser Phe Se - #r Ile Arg Ser Tyr Glu     130                 1 - #35                 1 - #40                 1 -     #45     - ATT GGC GCT GAT CGA ACA GCC TCA ATA GAG AC - #G CTG ATG AAC CAT CTT     3776     Ile Gly Ala Asp Arg Thr Ala Ser Ile Glu Th - #r Leu Met Asn His Leu     #               160     - CAG GTACTGCATT GAAACTATTC AACCATAGCA TTGCTAGTGA TCTGTAAAT - #G     3829     Gln     - AGCCACGACT GACGATGACA TAGATACACC GAATTGCCAG TATATGTGTG TC - #CATTTTAA     3889     - TTTAGAGCTG ATGTTATTAT AAGTTCATGA TGAGGTTGGT ATCTCAGGAT GA - #GATTTGTA     3949     - AGGTTGATTC AAGGGAGGAA CCATAACATA TGTTTGATTG TATTTCCTCG TT - #AACTCCAT     4009     #CTG AAT CAT      4062T TTTTTCTCTA AACAG GAA ACA TCT     #   Glu Thr Ser Leu Asn His     #           165     - TGT AAG AGT CTC GGT CTT CTC AAT GAC GGC TT - #T GGT CGT ACT CCT GAG     4110     Cys Lys Ser Leu Gly Leu Leu Asn Asp Gly Ph - #e Gly Arg Thr Pro Glu     #   180     - ATG TGT AAG AGG GAC CTC ATT TGG GTG GTT AC - #G AAA ATG CAG GTA ATG     4158     Met Cys Lys Arg Asp Leu Ile Trp Val Val Th - #r Lys Met Gln Val Met     185                 1 - #90                 1 - #95                 2 -     #00     - GTG AAT CGC TAT CCT ACT  TG  GTAAGTTTGT CT - #CTGCTTGT TTGTCTTATG     4208     Val Asn Arg Tyr Pro Thr  Trp                     205     - GTCCACAAAT CTCTCTTACG GTAATGGTTG TAAACATAGT GGAATGTAAT GG - #CATGTGTG     4268     #ACT TGG GTC TCC GAG    4318CT ATC GAG GTC ACT     #Asp Thr Ile Glu Val Thr Thr Trp Val Ser G - #lu     #       215     - TCG GGA AAA AAC GGT ATG AGT CGT GAT TGG CT - #G ATA AGT GAT TGC CAT  
   4366     Ser Gly Lys Asn Gly Met Ser Arg Asp Trp Le - #u Ile Ser Asp Cys His     220                 2 - #25                 2 - #30                 2 -     #35     # GTAGACTTTT CTGGTTCTGA      4415 ACG  AG     Ser Gly Glu Ile Leu Ile Arg Ala Thr  Ser     #                245     - TTTTACATTC TTAAACCTTC TGATGTTCGA CTGAGAGCAG ACATTTGGTA TG - #TTTTATAT     4475     - TGAAAGTTGA GTCAAGTCAC TCTAATACTA TCGCAG C GTG TGG G - #CT ATG ATG     4527     #      Val Trp Ala Met Met     #    250     - AAT CAA AAG ACA AGA AGA TTG TCA AAA ATT CC - #A GAT GAG GTT CGA CAG     4575     Asn Gln Lys Thr Arg Arg Leu Ser Lys Ile Pr - #o Asp Glu Val Arg Gln     #               265     - GAG ATA GTG CCT TAT TTT GTG GAC TCT GCT CC - #T GTC ATT GAA GAC GAT     4623     Glu Ile Val Pro Tyr Phe Val Asp Ser Ala Pr - #o Val Ile Glu Asp Asp     #           280     - CGA AAA TTG CAC AAG CTT GAT GTG AAG ACG GG - #T GAT TCC ATT CGC AAT     4671     Arg Lys Leu His Lys Leu Asp Val Lys Thr Gl - #y Asp Ser Ile Arg Asn     #       295     - GGT CTA ACT GTAAGTCCCT ATATTTCAAT ATGAAATGTG GCGCGTTTC - #A     4720     Gly Leu Thr         300     #AGG TGG      4775CTGAG GCGATCTATC TCTTCACGGT CTGTAG CCA     #Trp            Pro Arg     - AAT GAC TTT GAT GTC AAT CAG CAC GTT AAC AA - #T GTG AAG TAC ATT GCG     4823     Asn Asp Phe Asp Val Asn Gln His Val Asn As - #n Val Lys Tyr Ile Ala     305                 3 - #10                 3 - #15                 3 -     #20     - TGG CTT CTC AAG GTACCCTTTT CATCATACAA ACAACTGATA TA - #TATATCTG     4875     Trp Leu Leu Lys     - CTCGCCCAAG TATCTGCTTG CTAGCACTTG AGATATTACT TAAATATCGT GG - #ATTAGTAT     4935     - TGCCCCGAGT TTGTCAATGC TTGATTTACA CAGTTCAGCT AAACAAATCT GT - #AATCTATA     4995     #GAG ACC CAG GAG       5044 CCA ACA GAA GTT TTC     #Ser Val Pro Thr Glu Val Phe Glu Thr Gln G - #lu     #335     - CTA TGC GGC CTC ACC CTT GAG TAT AGG CGG GA - #A TGC AGA AGG GAC AGT     5092     Leu Cys Gly Leu Thr Leu Glu Tyr Arg Arg Gl - #u Cys Arg Arg Asp Ser     #           
    350     - GTG CTG GAG TCC GTG ACC GCT ATG GAT CCC TC - #A AAA GAG GGA GAC CGG     5140     Val Leu Glu Ser Val Thr Ala Met Asp Pro Se - #r Lys Glu Gly Asp Arg     #           365     - TCT CTG TAC CAG CAC CTT CTT CGG CTT GAG AA - #T GGG GCT GAT ATC GCC     5188     Ser Leu Tyr Gln His Leu Leu Arg Leu Glu As - #n Gly Ala Asp Ile Ala     #       380     - TTG GGT AGA ACC GAG TGG CGG CCG AAG AAT GC - #A GGA GCC AAT GGG GCA     5236     Leu Gly Arg Thr Glu Trp Arg Pro Lys Asn Al - #a Gly Ala Asn Gly Ala     #   395     - GTA TCA ACA GGA AAG ACT TCA AAT GGA AAT TC - #T GTC TCT TAGAAGTGGC     5285     Val Ser Thr Gly Lys Thr Ser Asn Gly Asn Se - #r Val Ser     400                 4 - #05                 4 - #10     - TGGGGGCCTT TCCAAGTTGT GCGTTTATTT TTTCTGAAAG AAGGGAATGT TG - #CTGCAATC     5345     - AGTAAACTGT GTAGTTCGTT TGCAGTTTGT ATATTAACAC GGTCGGTCGT GT - #TTGTATTT     5405     - GCTAAGACAA ATAGCACATT CATCGTTACA TATCGTAGAT CTCGAACAGT AC - #TGTCAAGC     5465     #            5467     - (2) INFORMATION FOR SEQ ID NO: 8:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 530 Base               (B) TYPE: nucleic acid     #stranded (C) STRANDEDNESS: double               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE:: c-DNA     -    (iii) HYPOTHETICAL: No     -    (iii) ANTI-SENSE: No     -     (vi) ORIGINAL SOURCE:               (A) ORGANISM: Umbellularia - # californica     -    (vii) IMMEDIATE SOURCE:               (A) LIBRARY: c-DNA Bank - # ZAP     -     (ix) FEATURE:     #PCR 42   (A) NAME/KEY: PCR-Product               (B) LOCATION: 1..530     -     (ix) FEATURE:               (A) NAME/KEY: Oligonucleot - #ide primer               (B) LOCATION: 1..23     -     (ix) FEATURE:               (A) NAME/KEY: CDS               (B) LOCATION: 1..327     -     (ix) FEATURE:               (A) NAME/KEY: Stopcodon               (B) LOCATION: 328..330     #8:   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:     - TGG AAT GAC TTG GAT GTG AAC CAG CAC GTT AA - #C AAT 
GTG AAG TAC ATT       48     Trp Asn Asp Leu Asp Val Asn Gln His Val As - #n Asn Val Lys Tyr Ile     #                 15     - GCG TGG CTT CTC AAG AGT GTT CCA ACA GAA GT - #T TTC GAG ACC CAG GAG       96     Ala Trp Leu Leu Lys Ser Val Pro Thr Glu Va - #l Phe Glu Thr Gln Glu     #             30     - CTA TGC GGC CTC ACC CTT GAG TAT AGG CGG GA - #A TGC AGA AGG GAC AGT      144     Leu Cys Gly Leu Thr Leu Glu Tyr Arg Arg Gl - #u Cys Arg Arg Asp Ser     #         45     - GTG CTG GAG TCC GTG ACC GCT ATG GAT CCC TC - #A AAA GAG GGA GAC CGG      192     Val Leu Glu Ser Val Thr Ala Met Asp Pro Se - #r Lys Glu Gly Asp Arg     #     60     - TCT CTG TAC CAG CAC CTT CTT CGG CTT GAG AA - #T GGG GCT GAT ATC GCC      240     Ser Leu Tyr Gln His Leu Leu Arg Leu Glu As - #n Gly Ala Asp Ile Ala     # 80     - TTG GGT AGA ACC GAG TGG CGG CCG AAG AAT GC - #A GGA GCC AAT GGG GCA      288     Leu Gly Arg Thr Glu Trp Arg Pro Lys Asn Al - #a Gly Ala Asn Gly Ala     #                 95     - GTA TCA ACA GGA AAG ACT TCA AAT GGA AAT TC - #T GTC TCT TAGAAGTGGC      337     Val Ser Thr Gly Lys Thr Ser Asn Gly Asn Se - #r Val Ser     #           105     - TGGGGGCCTT TCCAAGTTGT GCGTTTATTT TTTCTGAAAG AAGGGAATGT TG - #CTGCAATC      397     - AGTAAACTGT GTAGTTCGTT TGCAGTTTGT ATATTAACAC GGTCGGTCGT GT - #TTGTATTT      457     - GCTAAGACAA ATAGCACATT CATCGTTACA AAAAAAAAAA AAAAAAAAAG CT - #TCCTAGGT      517     #     530     - (2) INFORMATION FOR SEQ ID NO: 9:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) LENGTH: 23 Base               (B) TYPE: nucleic acid     #stranded (C) STRANDEDNESS: double               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE:: DNA     -    (iii) HYPOTHETICAL: No     -    (iii) ANTI-SENSE: No     -    (vii) IMMEDIATE SOURCE:               (B) CLONE: 5'- Prime - #r 3532     #9:   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:     #                23TNAA YGA     - (2) INFORMATION FOR SEQ ID NO: 10:     -      (i) SEQUENCE CHARACTERISTICS:     #pairs    (A) 
LENGTH: 41 Base               (B) TYPE: nucleic acid     #stranded (C) STRANDEDNESS: double               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE:: DNA     -    (iii) HYPOTHETICAL: No     -    (iii) ANTI-SENSE: No     -    (vii) IMMEDIATE SOURCE:               (B) CLONE: 3'- Prime - #r 2740     #10:  (xi) SEQUENCE DESCRIPTION: SEQ ID NO:     #   41             TTCG AAGGATCCAA GCTTGTCGAC T     - (2) INFORMATION FOR SEQ ID NO:11:     -      (i) SEQUENCE CHARACTERISTICS:     #acids    (A) LENGTH: 366 amino               (B) TYPE: amino acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: protein     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:     #His Leu His Thr Phe Serys Asn Val Thr Asn     #                 15     #Val Asn Arg Arg Thr Leuer Leu Phe Ile Pro     #             30     #Ala Leu Asp Pro Leu Argln Pro Arg Lys Pro     #         45     #Ser Pro Val Asn Ser Cyssp Gln Gly Ser Ile     #     60     #Leu Met Glu Asp Gly Tyrhe Arg Ala Gly Arg     # 80     #Tyr Glu Val Gly Ile Asnhe Ile Val Arg Ser     #                 95     #Leu Leu Gln Glu Val Alalu Thr Ile Ala Asn     #            110     #Thr Asp Gly Phe Ala Thrys Cys Gly Phe Ser     #        125     #Trp Val Thr Ala Arg Metys Leu His Leu Ile     #    140     #Ser Asp Val Val Glu Ileys Tyr Pro Ala Trp     #160     #Gly Thr Arg Arg Asp Trper Glu Gly Arg Ile     #                175     #Ile Gly Arg Ala Thr Serla Thr Asn Glu Val     #            190     #Arg Leu Gln Arg Val Thrsn Gln Asp Thr Arg     #        205     #Cys Pro Arg Glu Pro Arglu Tyr Leu Val Phe     #    220     #Leu Lys Lys Ile Pro Lyslu Asn Asn Ser Ser     #240     #Glu Leu Lys Pro Arg Argln Tyr Ser Met Leu     #                255     #Asn Val Thr Tyr Ile Glysn Gln His Val Asn     #            270     #Ile Asp Thr His Glu Leule Pro Gln Glu Ile     #        285     #Cys Gln Gln Asp Asp Ilesp Tyr Arg Arg Glu     #    300     #Asp Asp Pro Ile Ser Lyshr Ser Glu Ile Pro     #320    
 #Ser Ile Gln Gly His Asnly Ser Ala Thr Ser     #                335     #Ser Glu Asn Gly Gln Gluis Met Leu Arg Leu     #            350     #Lys Ser Ser Argly Arg Thr Gln Trp Arg Lys     #        365     - (2) INFORMATION FOR SEQ ID NO:12:     -      (i) SEQUENCE CHARACTERISTICS:     #acids    (A) LENGTH: 382 amino               (B) TYPE: amino acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: protein     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:     #Cys Ser Met Lys Ala Valeu Ala Ser Ala Phe     #                 15     #Pro Arg Ser Ser Asp Leuly Arg Gly Met Lys     #             30     #Leu Lys Met Ile Asn Glysn Ala Pro Thr Ser     #         45     #Arg Leu Pro Asp Trp Serhr Glu Ser Leu Lys     #     60     #Ser Ala Ala Glu Lys Glnle Thr Thr Ile Phe     # 80     #Lys Leu Pro Gln Leu Leurp Lys Pro Lys Pro     #                 95     #Phe Arg Arg Thr Phe Alaeu His Gly Leu Val     #            110     #Ser Thr Ser Ile Leu Alaal Gly Pro Asp Arg     #        125     #Asn His Ala Lys Ser Valln Glu Ala Thr Leu     #    140     #Leu Glu Met Ser Lys Argly Phe Gly Thr Thr     #160     #Val Ala Val Glu Arg Tyral Arg Arg Thr His     #                175     #Cys Trp Ile Gly Ala Serhr Val Glu Val Glu     #            190     #Val Arg Asp Cys Lys Thrrg Arg Asp Phe Leu     #        205     #Ser Val Leu Met Asn Thrrg Cys Thr Ser Leu     #    220     #Glu Val Arg Gly Glu Ileer Thr Ile Pro Asp     #240     #Lys Asp Asp Glu Ile Lyssp Asn Val Ala Val     #                255     #Asp Tyr Ile Gln Gly Glysn Asp Ser Thr Ala     #            270     #Asn Gln His Val Asn Asnsn Asp Leu Asp Val     #        285     #Val Pro Asp Ser Ile Pherp Val Phe Glu Thr     #    300     #Glu Tyr Arg Arg Glu Cyser Ser Phe Thr Leu     #320     #Thr Val Ser Gly Gly Sereu Arg Ser Leu Thr     #                335     #Leu Gln Leu Glu Gly Glyal Cys Asp His Leu     #            350     #Arg Pro Lys Leu Thr Aspla Arg Thr Glu Trp     #        365 
    #Glu Pro Arg Vally Ile Ser Val Ile Pro Ala     #    380     - (2) INFORMATION FOR SEQ ID NO:13:     -      (i) SEQUENCE CHARACTERISTICS:     #acids    (A) LENGTH: 389 amino               (B) TYPE: amino acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: protein     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:     #Ala Ala Ala Thr Thr Threu Pro Thr Thr Ala     #                 15     #Gly Ala Leu Pro His Serly Val Asn Ser Arg     #             30     #Arg Ser Thr Gly Ser Leula Ser Ile Arg Lys     #         45     #Val Met Ala Val Arg Thrrg Thr Val Ala Pro     #     60     #Leu Lys Glu Ala Glu Alaly Val Ala Val Gly     # 80     #Arg Met Gly Ser Leu Threu Ala Asp Arg Leu     #                 95     #Ile Ile Arg Cys Tyr Gluyr Lys Glu Arg Phe     #            110     #Thr Ile Ala Asn Leu Leuhr Ala Thr Val Glu     #        125     #Val Gly Phe Ser Thr Aspsn His Ala Gln Ser     #    140     #Leu His Leu Ile Trp Valhr Thr Met Arg Lys     #160     #Tyr Pro Ala Trp Ser Asple Glu Ile Tyr Arg     #                175     #Glu Gly Arg Ile Gly Thrhr Trp Cys Gln Ser     #            190     #Ser Gly Glu Val Ile Glyet Lys Asp His Ala     #        205     #Glu Asp Thr Arg Arg Leurp Val Met Met Asn     #    220     #Tyr Leu Val Phe Cys Prosp Val Arg Asp Glu     #240     #Asn Thr Ser Ser Leu Lysla Phe Pro Glu Lys     #                255     #Tyr Ser Thr Leu Gly Leulu Asp Pro Ala Glu     #            270     #Lys His Val Asn Asn Valsp Leu Asp Met Asn     #        285     #Pro Gln Glu Val Ile Aspal Leu Glu Ser Ile     #    300     #Tyr Arg Arg Glu Cys Glnhr Ile Thr Leu Asp     #320     #Ser Glu Ser Leu Leu Aspsp Ser Leu Thr Ser     #                335     #Asn Gly Ser Ser Val Proys Leu Glu Gly Thr     #            350     #Leu His Leu Leu Arg Sersp Leu Ser Arg Phe     #        365     #Arg Thr Glu Trp Arg Lyslu Leu Asn Arg Gly     #    380     -  Lys Pro Ala Lys Lys      385     - (2) INFORMATION FOR SEQ ID NO:14:     
-      (i) SEQUENCE CHARACTERISTICS:     #acids    (A) LENGTH: 385 amino               (B) TYPE: amino acid               (C) STRANDEDNESS: single               (D) TOPOLOGY: linear     -     (ii) MOLECULE TYPE: protein     -     (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:     #Ala Val Ala Ala Met Tyrla Pro Ala Ala Pro     #                 15     #Leu Thr His Ser Arg Sersp Thr Thr Phe Ala     #             30     #Asn Val Phe Leu Cys Asnle Arg Arg Arg Tyr     #         45     #Leu Leu Ala Val Ala Thrrg Lys Val Ser Pro     #     60     #Arg Glu Ala Asp Lys Gluly Val Ala Ser Leu     # 80     #Ser Leu Thr Glu Asp Glyrg Leu Arg Leu Gly     #                 95     #Cys Tyr Glu Val Gly Ileys Phe Val Ile Arg     #            110     #Asn Leu Leu Gln Glu Valle Glu Thr Ile Ala     #        125     #Ser Thr Asp Gly Phe Alaln Gly Val Gly Phe     #    140     #Ile Trp Val Thr Ala Argrg Lys Leu His Leu     #160     #Trp Ser Asp Val Ile Gluyr Arg Tyr Pro Ala     #                175     #Val Gly Thr Arg Arg Aspln Gly Glu Gly Lys     #            190     #Val Ile Gly Arg Ala Thryr Ala Asn Gly Glu     #        205     #Arg Arg Leu Gln Lys Valet Asn Glu Asp Thr     #    220     #Phe Cys Pro Arg Thr Leulu Glu Tyr Leu Val     #240     #Ser Met Lys Lys Ile Prolu Glu Asn Asn Asn     #                255     #Leu Gly Leu Val Pro Argla Glu Tyr Ser Arg     #            270     #Asn Asn Val Thr Tyr Ileet Asn Lys His Val     #        285     #Ile Ile Asp Thr His Gluer Ile Pro Pro Glu     #    300     #Glu Cys Gln Arg Asp Aspeu Asp Tyr Arg Arg     #320     #Leu Gly Asn Ala Ala Glyhr Ser Arg Glu Pro     #                335     #Ser Pro Lys Lys Asp Glule Asn Gly Ser Val     #            350     #Arg Ser Ala Gly Ser Glyhe Met His Leu Leu     #        365     #Arg Lys Lys Pro Ala Lysys Arg Thr Glu Trp     #    380     -  Arg      385     __________________________________________________________________________ 

We claim:
 1. An isolated nucleic acid that encodes the amino acid sequence encoded by the genomic DNA sequence of SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:4, or SEQ ID NO:7.
 2. The isolated nucleic acid of claim 1, which comprises the genomic DNA sequence of SEQ ID NO:4.
 3. The isolated nucleic acid of claim 1, which comprises the genomic DNA sequence of SEQ ID NO:7.
 4. An isolated nucleic acid that encodes the amino acid sequence encoded by the cDNA sequence of SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:3.
 5. The isolated nucleic acid of claim 4, which comprises the cDNA sequence of SEQ ID NO:1.
 6. The genomic clone ClTEg4 (DSM 8493) which comprises the nucleic acid sequence of SEQ ID NO:5.
 7. The genomic clone ClTEg7 (DSM 8494) which comprises the nucleic acid sequence of SEQ ID NO:6.
 8. The plasmid pNBM99-TEg1 (DSM 8477).
 9. The plasmid pNBM99-TEg16 (DSM 8478).
 10. A method of producing a transformed plant cell which comprises fatty acids of middle chain length, said method comprising the step of transferring by means of gene technology the isolated nucleic acid of claim 1 or claim 4 into a cell of a plant to form a transformed plant cell, wherein said isolated nucleic acid is expressed to yield said fatty acids.
 11. The method of claim 10, wherein the plant cell comprises capric acid (C_(10:0)).
 12. The method of claim 10, wherein the plant cell comprises myristic acid (C_(14:0)).
 13. The method of claim 11, which further comprises transferring by means of gene technology an isolated nucleic acid which codes for acyl carrier protein 2 (ACP), ketoacyl-ACP-synthase (KAS), ketoreductase, enoylreductase, or lysophosphatide-acyltransferase.
 14. The method of claim 12, which further comprises transferring by means of gene technology an isolated nucleic acid which codes for acyl carrier protein 2 (ACP), ketoacyl-ACP-synthase (KAS), ketoreductase, enoylreductase, or lysophosphatide-acyltransferase.
 15. The method of claim 10, wherein the isolated nucleic acid is transferred by microinjection, electroporation, particle gun, the soaking of parts of plants in DNA solutions, pollen or pollen tube transformation, transfer of appropriate recombinant Ti-plasmids or Ri-plasmids from Agrobacterium tumefaciens, liposome-mediated transfer, or plant viruses.
 16. The method of claim 10, further comprising the step of regenerating plants or plant parts from the transformed plant cell.
 17. A plant cell produced by the procedure of claim 10.
 18. Plants or plant parts produced by the procedure of claim 16.